free energy -- Webhooks and generators

contact: wefatherley at gmail dot com


A recent project of mine involved developing a set of MVC services that integrate Yodlee's Webhooks API workflow. This API makes it possible to gather financial data in an automated way, which is tremendously convenient for asset managers, investment advisors, and many other financial institutions. Data such as individual transactions and account status are available just as they are through other Yodlee APIs, except that the Webhooks API has the following workflow for gathering them:
[Figure: Webhooks API workflow diagram]
As can be seen, the Webhooks API sends a notification to your backend whenever new or updated data is available. Each notification contains a set of hrefs that your backend can GET, and each GET returns JSON with the new or updated data. A notification has the following form:

{
    "event": {
        "notificationId": "64b7ed1a-1530523285",
        "notificationTime": "2017-11-10T11:18:43Z",
        "info": "DATA_UPDATES.USER_DATA",
        "data": {
            "userCount": 1,
            "fromDate": "2017-11-10T10:18:44Z",
            "toDate": "2017-11-10T11:18:43Z",
            "userData": [
                {
                    "user": {
                        "loginName": "YSL1484052178554"
                    },
                    "links": [
                        {
                            "methodType": "GET",
                            "rel": "getUserData",
                            "href": "dataExtracts/userData?fromDate=...&toDate=...&loginName=..."
                        }
                    ]
                }
            ]
        }
    }
}

In a production setting, each notification that arrives needs to be parsed so that the new or updated financial data can be fetched. Specifically, for each object in the userData array, we must visit each href in its links array. Our iteration over the notification is therefore something like this:

for user in notification["event"]["data"]["userData"]:
    for link in user["links"]:
        ...  # GET the href, and process the result
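
As a minimal sketch of that traversal, the hypothetical helper iter_hrefs below walks a notification that has already been decoded into a dict and pairs each user's loginName with the href to fetch. The helper name and the trimmed sample notification are assumptions for illustration; only the key names come from the notification shown above.

```python
def iter_hrefs(notification):
    """Yield (login_name, href) pairs from a decoded notification dict."""
    for user in notification["event"]["data"]["userData"]:
        login_name = user["user"]["loginName"]
        for link in user["links"]:
            yield login_name, link["href"]


# Trimmed-down copy of the notification shown earlier (illustrative only).
sample = {
    "event": {
        "data": {
            "userData": [
                {
                    "user": {"loginName": "YSL1484052178554"},
                    "links": [
                        {
                            "methodType": "GET",
                            "rel": "getUserData",
                            "href": "dataExtracts/userData?fromDate=...&toDate=...&loginName=...",
                        }
                    ],
                }
            ]
        }
    }
}

# One pair per link, in the order the notification lists them.
pairs = list(iter_hrefs(sample))
```

Keeping the traversal in its own generator means the code that performs the GETs never has to know how the notification is nested.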

Each GET yields a JSON object with new or updated data that looks like this:

{
    "userData": [
        {
            "user": {
                "loginName":  "yslasset1"
            },
            "providerAccount": [
                {
                    "id": 10655515,
                    "aggregationSource": "USER",
                    "lastUpdated": "2017-02-20T10:18:46Z",
                    "providerId": 16441,
                    "isDeleted": false,
                    ...
                }
            ],
            "account": [
                {
                    "CONTAINER": "bank",
                    "lastUpdated": "2017-02-20T10:18:59Z",
                    "isDeleted": false,
                    "id": 1111901803,
                    "createdDate": "2017-01-10T13:38:10Z",
                    "providerAccountId": 10655515,
                    ...
                }
            ],
            "transaction": [
                {
                    "accountId": 1111901802,
                    "isDeleted": false,
                    "status": "POSTED",
                    "CONTAINER": "bank",
                    "categoryId": 27,
                    "date": "2017-02-20",
                    "type": "DEPOSITS_CREDITS",
                    "amount": {
                        "amount": 343465,
                        "currency": "USD"
                    },
                    "baseType": "CREDIT",
                    ...
                }
            ],
            "holding": [
                {
                    "id": 1725230,
                    "accountId": 1111901806,
                    "description": "CDDesc",
                    "holdingType": "CD",
                    "isDeleted": false,
                    "interestRate": 2000,
                    "providerAccountId": 10655515,
                    ...
                }
            ]
        }
    ]
}

and it is this data that can be used for business purposes. One caveat is that if there are more than 500 transactions in the transaction array, the response's Link header will contain another href pointing to the remaining transactions. We need to build this logic into the nested for-loops we wrote above, and we can do this with a generator! An advantage of this approach is that the generator keeps iterating for as long as it needs to: it yields all the available data, including any data referenced by Link headers, all by itself. The backend I wrote this for uses aiohttp, so this generator looks something like:

async def get_link_data(connection, href, user_id):
    while True:
        # GET the current href
        response = await connection.request(
            "GET", href, user_id=user_id
        )
        data = await response.json()
        # hand back one userData entry from this page
        yield data["userData"].pop()
        # follow the Link header's "next" relation, if there is one
        if "next" in response.links:
            href = response.links["next"]["url"]
        else:
            return

As you can see, this generator takes an href and yields its data. Then, if the response has a Link header (exposed by aiohttp's response.links attribute), it picks up the corresponding href, GETs it, and yields that data too before we move on in the original event notification. Yay! Thus the webhooks workflow (aside from subscribing to Yodlee's Webhooks API) is nicely summarized by the following block:

# runs inside a coroutine, since async for is only valid there
for user in notification["event"]["data"]["userData"]:
    for link in user["links"]:
        user_data = get_link_data(a_conn_obj, link["href"], some_user_id)
        async for extract in user_data:
            ...  # insert new/updated data into a database!
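
To try the pagination logic without touching the network, here is a rough offline sketch. FakeResponse and FakeConnection are hypothetical stand-ins I made up for this demo (they are not part of aiohttp or Yodlee), and the page names and transaction ids are invented. The only thing they imitate is the shape the generator relies on: an awaitable json() and a links mapping with an optional "next" entry.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for a response: a JSON body plus parsed links."""

    def __init__(self, body, links):
        self._body = body
        self.links = links  # same shape the generator reads: links["next"]["url"]

    async def json(self):
        return self._body


class FakeConnection:
    """Hypothetical connection that serves canned pages keyed by href."""

    def __init__(self, pages):
        self.pages = pages

    async def request(self, method, href, user_id=None):
        return self.pages[href]


async def get_link_data(connection, href, user_id):
    # Same logic as the generator above.
    while True:
        response = await connection.request("GET", href, user_id=user_id)
        data = await response.json()
        yield data["userData"].pop()
        if "next" not in response.links:
            return
        href = response.links["next"]["url"]


async def main():
    # Two invented pages: the first points at the second via "next".
    pages = {
        "page1": FakeResponse(
            {"userData": [{"transaction": [{"id": 1}]}]},
            {"next": {"url": "page2"}},
        ),
        "page2": FakeResponse({"userData": [{"transaction": [{"id": 2}]}]}, {}),
    }
    conn = FakeConnection(pages)
    return [extract async for extract in get_link_data(conn, "page1", "demo-user")]


extracts = asyncio.run(main())
```

The caller only asks for the first href, yet extracts ends up holding one entry per page, because the generator follows the "next" link on its own.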