For many of us, when we consume an API in our application, we tend to go with something similar to writing functions that encompass the happy path, such as:
```php
// Fetch some remote contents
function getIp () {
    return json_decode(file_get_contents('http://httpbin.org/ip'));
}
```
```javascript
// Fetch some remote contents
function getIp () {
  return fetch('http://httpbin.org/ip').then(res => res.json())
}
```
```clojure
;; Fetch some remote contents
(require '[cheshire.core :as json])

(defn get-ip []
  (-> (slurp "http://httpbin.org/ip")
      json/parse-string))
```
However, that leaves a lot to be desired. In fact, if that's all you wish to do in your software when consuming a resource you don't control, you may as well stop now and just call 'curl | jq .' in a bash script (not that there's anything wrong with that - it has its place, but usually not as a deeper part of a larger system).
There are many things that could go wrong:

- The response body may not be in the format or schema you expect (missing keys, wrong types, malformed JSON)
- The remote may be slow to respond, or time out entirely
- The endpoint may return a non-2xx status code
As such, it's our duty to ensure that we handle all possible paths, not just how the API performed on a good day of testing (after all, if you were writing in a language with manual memory allocation, you wouldn't tend to malloc without confirming you claimed the memory you asked for before proceeding - would you?).
What you may notice is that in some teams, the prioritization of "handled errors" tends to work in the opposite order from how that bullet list reads: a team may handle non-2xx status codes if you're lucky; if you're very lucky, they may even give some consideration to timeouts; and if you're close to a lotto winner, they actually verify the format of the remote response!
However, if you work in the opposite order (strong assertions on schema / properties / keys for your "good" responses), you can end up with most of the third item (non-2xx) taken care of for free: if the remote gives data in an unacceptable format, you immediately know, at the least, to trigger the error conditions showing the data is in a bad state.
The response you get today may not be the same tomorrow! It's very useful to ensure that you define your known good data format sooner rather than later.
The sample URL given earlier, http://httpbin.org/ip, is a convenient testing API that returns a JSON-encoded response with a single key, "origin", of the string type, containing the requesting system's IP address.
Let's model that out in a data model or schema-based structure (just for the heck of it, I'm going to aim for a functional and immutable style of code in most of the samples).
```php
// compose(f, g) = f(g(x1, x2, ..., xN))
function compose ($f, $g) {
    return function () use ($f, $g) {
        return $f(call_user_func_array($g, func_get_args()));
    };
}

$assert = function (string $message = 'Bad Argument') {
    return function ($x) use ($message) {
        if (false === (bool) $x) throw new \InvalidArgumentException($message);
    };
};

$isIpFormat = function (string $s) {
    return preg_match('/^\d+\.\d+\.\d+\.\d+/', $s);
};

$assertIpFormat = compose($assert('Invalid IP format'), $isIpFormat);

class IpModel {
    private $ip;

    // Builds a model from JSON.
    public function __construct(string $json) {
        $this->unserialize($json);
    }

    public function setIp (string $ip) {
        global $assertIpFormat;
        $assertIpFormat($ip);
        $this->ip = $ip;
    }

    public function unserialize(string $json) {
        $tmp = json_decode($json);
        $this->setIp($tmp->origin);
    }
}

$getIp = function () {
    return file_get_contents('http://httpbin.org/ip');
};
$makeIpModel = function (string $json) {
    return new IpModel($json);
};
$getIpModel = compose($makeIpModel, $getIp);

$model = $getIpModel();
var_dump($model);
```
```javascript
const fetch = require('node-fetch')

// compose(f, g) = (...args) => f(g(...args))
const compose = (f, g) => (...args) => f(g.apply(g, args))
const _throw = s => { throw new Error(s) }
const assert = message => x => {
  return false === Boolean(x) ? _throw(message) : undefined
}

const isIpFormat = s => /^\d+\.\d+\.\d+\.\d+/.test(s)
const assertIpFormat = compose(assert('Invalid IP format'), isIpFormat)

class IpModel {
  constructor (json) {
    this.unserialize(json)
  }

  setIp (ip) {
    assertIpFormat(ip)
    this.ip = ip
  }

  unserialize (json) {
    const tmp = JSON.parse(json)
    this.setIp(tmp.origin)
  }
}

const getIp = _ =>
  fetch('http://httpbin.org/ip')
    .then(res => res.json())
    .then(JSON.stringify)
const makeIpModel = json => new IpModel(json)
const getIpModel = _ => getIp().then(makeIpModel)

getIpModel().then(console.log)
```
```clojure
;; clojure.spec is a much better way to do this,
;; but this will suffice for a simpler sample
(defn is-ip-format? [s]
  (re-find #"^\d+\.\d+\.\d+\.\d+" s))

(defn -assert [message]
  #(when (not %) (throw (Throwable. message))))

(def assert-ip-format
  (comp (-assert "Invalid IP format") is-ip-format?))

(defn get-ip []
  (slurp "http://httpbin.org/ip"))

(defn make-ip-model [json]
  (let [tmp (json/parse-string json)
        ip  (get tmp "origin")]
    (assert-ip-format ip)
    {:ip ip}))

(def get-ip-model (comp make-ip-model get-ip))

(get-ip-model)
```
Whoa! That was way more work than just doing the simple few-line call - why go to all that trouble for data portions you may or may not actually be consuming? We had to write is-clauses, assertions to extend them, and ensure each property or key in our data structure was conformant!
Well - you want to ensure that the data you are asking for is what you actually received. If you were a middle-man buying and reselling goods off of eBay, then other than factory-sealed goods (which you may consider at a higher trust level than goods sold by your peers), you would probably be inclined to inspect the contents you buy before reselling them (i.e., if your API were a middleware of sorts). This would be even more true if you intended to actually use some part(s) of those goods for your own purposes (in this sample, referring to the IP for some future computation, etc.).
Some languages / clients make this easy - others do not. In either case, you can avoid drastically different code branches/paths in your workflow by thinking it through in a logical fashion (obviously, more robustness with retry loops etc. is best, but that's more cumbersome, and not everyone has that kind of time).
You could do so in a way similar to this (assume we extend the aforementioned code from above):
```php
$fakeResponse = function () {
    return json_encode(['origin' => 'xxx']);
};
$getIp = function () {
    return @file_get_contents('http://httpbin.org/delay/10');
};
$getIpWithTimeout = function () {
    global $getIp, $fakeResponse;
    ini_set('default_socket_timeout', 1);
    $raw = $getIp();
    // Here, we re-join the model path to make use of its property assertion logic.
    return false === $raw ? $fakeResponse() : $raw;
};
$makeIpModel = function (string $json) {
    return new IpModel($json);
};
$getIpModel = compose($makeIpModel, $getIpWithTimeout);

$model = $getIpModel();
var_dump($model);
```
```javascript
const fakeResponse = _ => JSON.stringify({ origin: 'xxx' })
const getIp = _ =>
  fetch('http://httpbin.org/delay/10')
    .then(res => res.json())
    .then(JSON.stringify)

const getIpWithTimeout = _ => {
  return new Promise((resolve, _) => {
    // Fall back to the fake response if the real call takes over a second.
    const to = setTimeout(_ => resolve(fakeResponse()), 1e3)
    getIp().then(response => {
      clearTimeout(to)
      return resolve(response)
    })
  })
}

const makeIpModel = json => new IpModel(json)
const getIpModel = _ => getIpWithTimeout().then(makeIpModel)

getIpModel()
  .then(console.log)
  .catch(console.log)
  .then(_ => process.exit())
```
```clojure
(defn fake-response []
  "{\"origin\": \"xxx\"}")

;; This could probably be a bit more readable with an actual promise or something.
(defn get-ip-with-timeout []
  (let [response (atom nil)]
    (future
      (Thread/sleep 1e3)
      (reset! response (fake-response)))
    (future
      (reset! response (get-ip)))
    (while (= nil @response)
      (Thread/sleep 100))
    @response))

(def get-ip-model (comp make-ip-model get-ip-with-timeout))

(get-ip-model)
```
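The timeout fallback above could be hardened further with a basic retry loop before giving up and serving the fake response. Here's a minimal sketch in JavaScript - the `withRetry` helper and its parameters are illustrative names, not part of the samples above:

```javascript
// Retry a promise-returning function up to maxAttempts times,
// resolving with fallback() only once every attempt has failed.
const withRetry = (fn, maxAttempts, fallback) => {
  const attempt = n =>
    fn().catch(_ => (n + 1 < maxAttempts ? attempt(n + 1) : fallback()))
  return attempt(0)
}

// Usage sketch: wrap the earlier getIp, fall back to the fake response.
// withRetry(getIp, 3, fakeResponse).then(makeIpModel)
```

The same shape works for the PHP and Clojure variants; the point is that the retry policy lives in one wrapper rather than being copy-pasted into every call site.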
"That's preposterous!" you may shout. "How can your consumers of this model / data structure know the endpoint timed out!?" - well, you would usually augment your call-outs with logging / pro-active ping solutions (I recommend Riemann, or even a Slack channel/bot).
To your code-callers, there is often not much value in taking one function and branching it out into one good response path and tens of error condition paths, because then every time someone wants to call your code, they have to handle all N paths. Instead, you take the many failure cases and massage them into (at most) two paths the callers must check: a good response, or a response that failed to meet the model's data constraints.
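That two-path idea can be made explicit with a small wrapper that collapses every failure mode into a single tagged branch. A sketch (the `toResult` name and the `ok`/`error` shape are just one convention, not from the samples above):

```javascript
// Collapse every failure mode (timeout, bad status, bad schema)
// into a single { ok: false, error } branch for callers.
const toResult = promise =>
  promise
    .then(model => ({ ok: true, model }))
    .catch(error => ({ ok: false, error }))

// Usage sketch: callers only ever check two branches.
// toResult(getIpModel()).then(r => r.ok ? use(r.model) : log(r.error))
```

Whether the remote timed out, returned a 500, or handed back garbage, the caller sees exactly one failure shape to handle.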
At the least, what matters is sketching out this path to begin with (and staying cognizant of the fact that remote timeouts / unexpected formats can and do happen).
From here on out, you have the concept of a way to unify multiple error branches into one - expand on what you've got, and ensure you cover all the various error branches by not letting them surface as uncaught errors at your API level.
By following these steps (or at least keeping them in the back of your mind somewhere) you've succeeded in making some unknowns / non-deterministic inputs slightly more deterministic and guaranteed (at least, as far as anything that consumes your API interaction code is concerned).