Dartmouth API Developer Portal
Filtering and Paging
The DartAPI service implements a common filtering technique across most APIs that gives a rich set of searching capabilities. Filtering allows an API consumer to scan through resources, and page through large collection results.
NOTE: The current implementation of filtering does not check whether the attribute being queried is available to the consumer. In those cases the query term is discarded without error, and results are returned as if the term had not been supplied at all. Typical examples are an attribute name misspelled in the query, or a query on an attribute that is not available under the current security scopes. A future version of filtering will validate the query string inputs and raise an error in these situations.
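Because a misspelled attribute is dropped silently, it can help to check parameter names on the client before sending a request. The following is a minimal sketch, not part of DartAPI: the `KNOWN_ATTRIBUTES` set and the `unknown_filter_names` function are hypothetical, and the real set of attributes depends on the resource and on your scopes.

```python
from urllib.parse import parse_qsl

# Hypothetical allow-list (an assumption for illustration); the real set
# depends on the resource being queried and on your security scopes.
KNOWN_ATTRIBUTES = {
    "first_name", "last_name", "name", "netid",
    "account_status", "affiliations.name",
}
# Paging parameters are not attributes, so they are always accepted.
PAGING_PARAMETERS = {"page", "pagesize", "continuation_key"}


def unknown_filter_names(query_string, known=KNOWN_ATTRIBUTES):
    """Return the sorted parameter names in query_string that are not recognised."""
    # parse_qsl splits each term on the first '=', so operator characters
    # such as '|' or '!' stay in the value and do not affect the name.
    names = [name for name, _ in parse_qsl(query_string, keep_blank_values=True)]
    return sorted({n for n in names if n not in known and n not in PAGING_PARAMETERS})


# The first term misspells "affiliations", so it is reported.
print(unknown_filter_names("affilitations.name=|Staff,Faculty&account_status=Active"))
```

A non-empty result means at least one term would be ignored by the service, so the request would return more records than intended.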
If the result of a filter request is expected to contain a large number of objects, you should be prepared to make multiple requests to the API, each returning a page of objects. Results from a filter request (called the found set) are temporarily cached so that you may request batches of objects as pages from that same found set over a period of time. Currently, the found set is cached for 12 hours.
The typical sequence when working with subsets is to submit your filter request and then make subsequent requests for different pages of results from the found set. The first filter request returns two response headers that facilitate this paging. See the description of the filter request below.
The number of objects returned in a single request is specified with the pagesize parameter. The default pagesize is 100.
To ensure that subsequent requests for pages are returned from your found set, you must provide the continuation_key parameter with the value returned as the X-Request-ID header in the first filtering call to the API.
See the examples below for the process of requesting a filtered list of people and retrieving those results via multiple requests.
NOTE: By default all searching is case-sensitive; however, case-insensitive searching can be turned on by using the * wildcard character in the search term.
Multiple filter attributes are logically ANDed together when matching the filters. For example:
affiliations.name=Student&account_status=Active
returns the subset of all People who have Student in their affiliations attribute AND an account_status attribute value of 'Active'.
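The AND behaviour can be pictured with a small local filter. This is only an illustration of the logic, not how the service is implemented; the sample records and their field shapes (a list of affiliation objects, each with a `name`) are assumptions made for the sketch.

```python
# Hypothetical sample records; the field shapes are assumed for illustration.
people = [
    {"netid": "a1", "account_status": "Active", "affiliations": [{"name": "Student"}]},
    {"netid": "b2", "account_status": "Inactive", "affiliations": [{"name": "Student"}]},
    {"netid": "c3", "account_status": "Active", "affiliations": [{"name": "Staff"}]},
]


def matches(person):
    """Mirror affiliations.name=Student&account_status=Active locally."""
    is_student = any(a["name"] == "Student" for a in person["affiliations"])
    is_active = person["account_status"] == "Active"
    # Both filters must hold for the record to be selected.
    return is_student and is_active


matched = [p["netid"] for p in people if matches(p)]
print(matched)
```

Only the first record satisfies both conditions; the other two each fail one of them.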
Query Parameter Syntax
URI query parameters support the following operators; operators that support wildcard characters can cause comparisons to be case-insensitive:
Operator | Description | Wildcard Support | Example | Returns |
---|---|---|---|---|
= | equal | Yes | ?last_name=Smith | records with last_name equal to 'Smith' (case-sensitive) |
= | equal | Yes | ?name=*smith | records with name ending in smith (case-insensitive) |
=! | not equal | No | ?last_name=!Smith | records with last_name not equal to 'Smith' |
=\| | equal OR list | Yes | ?affiliations.name=\|Staff,Faculty | records with either 'Staff' or 'Faculty' in affiliations |
=\| | equal OR list | Yes | ?netid=\|D35000G*,F00002S* | records with netid 'd35000g' or 'f00002s' (case-insensitive) |
=^ | equal AND list | Yes | ?affiliations.name=^Staff,Student | records with both 'Staff' and 'Student' in affiliations |
=^ | equal AND list | Yes | ?affiliations.name=^staff*,student* | records with both 'Staff' and 'Student' in affiliations (case-insensitive) |
=> | greater than | No | ?last_name=>Smith | records with last_name greater than 'Smith' |
=< | less than | No | ?last_name=<Smith | records with last_name less than 'Smith' |
All attributes associated with a resource (e.g., people) are supported as query parameters; dot notation is used to specify nested attributes (e.g. affiliations.name).
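The operators can be wrapped in small helper functions so that query terms are assembled consistently. This is a minimal sketch; the function names are hypothetical and are not part of any DartAPI client library. Each function returns one `attribute=value` term exactly as written in the operator examples.

```python
def equal(attribute, value):
    """attribute=value (add * to the value for a case-insensitive match)."""
    return f"{attribute}={value}"


def not_equal(attribute, value):
    """attribute=!value"""
    return f"{attribute}=!{value}"


def any_of(attribute, values):
    """attribute=|v1,v2 (matches records with any of the values)."""
    return f"{attribute}=|{','.join(values)}"


def all_of(attribute, values):
    """attribute=^v1,v2 (matches records with all of the values)."""
    return f"{attribute}=^{','.join(values)}"


def greater_than(attribute, value):
    """attribute=>value"""
    return f"{attribute}=>{value}"


def less_than(attribute, value):
    """attribute=<value"""
    return f"{attribute}=<{value}"


# Terms are joined with '&', which ANDs them together.
query = "&".join([
    any_of("affiliations.name", ["Staff", "Faculty"]),
    equal("name", "*smith"),
])
print("?" + query)
```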
When multiple query parameters are specified, they are ANDed together. For example:
https://api.dartmouth.edu/api/people?affiliations.name=|Faculty,Staff&first_name=Al*&first_name=*an&last_name=|Smith,*jones*
returns records where:
- affiliations.name equals either 'Faculty' OR 'Staff' AND
- first_name starts with 'Al' (case-insensitive) AND
- first_name ends with 'an' (case-insensitive) AND
- last_name equals 'Smith' (case-sensitive) OR contains 'jones' (case-insensitive)
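A query like this one repeats the first_name parameter, so a plain dictionary cannot hold it. The sketch below builds the same URL from a list of (name, value) pairs using the standard library. It assumes the service accepts the `|`, `,` and `*` characters unescaped, as they appear in the examples on this page; the variable names are illustrative only.

```python
from urllib.parse import urlencode

base_url = "https://api.dartmouth.edu/api/people"

# A list of pairs keeps both first_name terms; a dict would keep only one.
filters = [
    ("affiliations.name", "|Faculty,Staff"),
    ("first_name", "Al*"),
    ("first_name", "*an"),
    ("last_name", "|Smith,*jones*"),
]

# safe= leaves the operator and wildcard characters unescaped
# (assumption: the service accepts them literally).
url = base_url + "?" + urlencode(filters, safe="|,*")
print(url)
```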
Querying null values and empty arrays
Any attribute that can have a value of null can be queried for the presence or absence of null. For example, ?middle_name=null will return all records where middle_name is null; ?middle_name=!null will return all records where middle_name is not null. Note that this feature means a search for the literal string value "null" cannot succeed; however, if a wildcard is used (e.g. ?middle_name=null*), values that contain "null" will be selected.
Any attribute that is an array can be queried for an empty or non-empty array. For example, ?dplans=[] or ?dplans=![]. Note that you cannot specify any values within the square brackets. To query an array for the presence of one or more values, =| can be used (e.g. ?affiliations=|Staff,Student). To query for the presence of all specified values, =^ can be used.
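The null and empty-array forms can be generated with two small helpers. This is a sketch with hypothetical function names; it only produces the query terms described above and does not contact the API.

```python
def null_term(attribute, negate=False):
    """Build attribute=null, or attribute=!null when negate is True."""
    return f"{attribute}={'!' if negate else ''}null"


def empty_array_term(attribute, negate=False):
    """Build attribute=[], or attribute=![] when negate is True."""
    return f"{attribute}={'!' if negate else ''}[]"


# People with a middle name on record and a non-empty dplans array.
query = "&".join([
    null_term("middle_name", negate=True),
    empty_array_term("dplans", negate=True),
])
print("?" + query)
```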
Example Filter Request
Request a subset of People to be returned matching the supplied filters:
/api/people?account_status=Active&affiliations.name=|Staff,Faculty
Parameters
Parameter Name | Type | Description | Required |
---|---|---|---|
attribute_name | string | the attribute to filter on; nested attributes use dot notation, such as telephone_numbers.data_source=sis | Yes |
page | integer | the page number to return when paging through the found set, 1 through n (default is 1) | No |
pagesize | integer | the number of objects to return on a page, 1 through n (default is 1000 and cannot be exceeded) | No |
Returns
The filter request returns two important HTTP response headers that will assist you in paging through large datasets:
Response Header | Description | Example |
---|---|---|
X-Request-ID | The value used for the continuation_key query parameter in requests for more data from the found set | 7f133a20-2ecd-11e8-831d-062da25625b0 |
X-Total-Count | The number of people found via the specified filters | 3826 |
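Both headers should be captured from the first response. The sketch below reads them from a plain dictionary of headers without depending on the capitalisation of the header names; the `paging_info` function is hypothetical, and the header values are the example values from the table.

```python
def paging_info(headers):
    """Return (continuation_key, total_count) from a mapping of response headers."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    continuation_key = lowered["x-request-id"]
    total_count = int(lowered["x-total-count"])
    return continuation_key, total_count


example_headers = {
    "X-Request-ID": "7f133a20-2ecd-11e8-831d-062da25625b0",
    "X-Total-Count": "3826",
}
key, total = paging_info(example_headers)
print(key, total)
```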
The filter request returns a collection of objects. The details of each object reflect the attributes allowed under the scopes of the current login.
The number of objects in this collection can vary, depending on one or more factors:
- If the filtering yields no objects, an empty collection is returned. This can be confirmed by noting that the X-Total-Count response header value is 0.
- If the combination of page and pagesize positions the paging mechanism past the end of the found set, an empty collection is returned.
- The last page of results may contain fewer than pagesize objects if the total number of objects in the found set is not an even multiple of pagesize.
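The number of pages and the size of the last page follow directly from X-Total-Count and pagesize. A minimal sketch of that arithmetic, with a hypothetical `page_layout` function:

```python
import math


def page_layout(total_count, pagesize):
    """Return (number_of_pages, objects_on_last_page) for a found set."""
    pages = math.ceil(total_count / pagesize)
    if pages == 0:
        # An empty found set has no pages at all.
        return 0, 0
    last_page_size = total_count - (pages - 1) * pagesize
    return pages, last_page_size


# 3826 objects at 1000 per page: three full pages and a partial fourth page.
print(page_layout(3826, 1000))
```

Requesting any page beyond the computed number of pages returns an empty collection.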
Subsequent Page Requests
Request a subsequent page of results from the found set produced by the filter request above. Note the use of the continuation_key query parameter containing the value returned in the X-Request-ID response header from the request above. Increment the page by one on each request until all pages have been retrieved.
Parameters
Parameter Name | Type | Description | Required |
---|---|---|---|
continuation_key | string | the value returned in the X-Request-ID header from the Filter Request | Yes |
page | integer | the page number to return when paging through the found set, 1 through n (default is 1) | No |
pagesize | integer | the number of objects to return on a page, 1 through n (default is 1000 and cannot be exceeded) | No |
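Once the first response has supplied the continuation key and the total count, the URLs for the remaining pages can be generated ahead of time. The following is a sketch under those assumptions; `subsequent_page_urls` is a hypothetical helper, and the key and count are the example header values shown earlier.

```python
import math
from urllib.parse import urlencode


def subsequent_page_urls(base_url, continuation_key, total_count, pagesize=1000):
    """Yield the URLs for pages 2..n of a found set."""
    last_page = math.ceil(total_count / pagesize)
    for page in range(2, last_page + 1):
        params = {
            "continuation_key": continuation_key,
            "pagesize": pagesize,
            "page": page,
        }
        yield base_url + "?" + urlencode(params)


urls = list(subsequent_page_urls(
    "https://api.dartmouth.edu/api/people",
    "7f133a20-2ecd-11e8-831d-062da25625b0",
    3826,
))
for url in urls:
    print(url)
```

The filter terms are not repeated on these requests; the continuation key identifies the found set.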
Full Python Example Querying People
The following example shows how to query a large result set over multiple pages. It can be run with any API key and does not require any security scopes.
import requests
import datetime
import os
#
# *************************************************************************************************
# * simple logging to console
# *************************************************************************************************
#
def logit(log_msg):
    print(datetime.datetime.now(), log_msg)
#
# *************************************************************************************************
# * get the jwt with any applicable scopes, check that all scopes requested are returned before
# * proceeding
# *************************************************************************************************
#
def login_jwt(login_url, api_key, scopes):
    headers = {'Authorization': api_key}
    if scopes:
        url = login_url + '?scope=' + scopes
    else:
        url = login_url
    response = requests.post(url, headers=headers)
    response.raise_for_status()
    response_json = response.json()
    jwt = response_json["jwt"]
    accepted_scopes = response_json["accepted_scopes"]
    logit("accepted scopes=" + str(accepted_scopes))
    if scopes:
        for scope in scopes.split(' '):
            if scope not in accepted_scopes:
                raise Exception('A requested scope ' + scope + ' is not in the set of accepted scopes.')
    return jwt
#
# *************************************************************************************************
# * This function will call the People API with a query string (or an empty string, which gets
# * all people) and returns a list of results. The function will page through the api result
# * set 1000 records per page.
# *************************************************************************************************
#
def get_people_by_query(jwt, people_url, query_string):
    headers = {'Authorization': 'Bearer ' + jwt, 'Content-Type': 'application/json'}
    # set up initial values. Note that Dartmouth APIs only return 1000 results per page max
    page_size = 1000
    page_number = 1
    done = False
    people = []
    continuation_key = None
    total_count = 0
    logit("get_people_by_query...")
    while not done:
        # first time through the loop, we supply the query string and paging parameters
        # on subsequent requests we supply the continuation key and increment the page number
        if page_number == 1:
            url = people_url + "?" + query_string + "&pagesize=" + str(page_size) + "&page=" + str(page_number)
        else:
            url = people_url + "?continuation_key=" + continuation_key + "&pagesize=" + str(page_size) + "&page=" + str(page_number)
        logit("calling people get with url=" + url)
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        # on the first page of results get the header values of "x-request-id" and "x-total-count"
        # from the response. The x-request-id must be used on subsequent pages as a
        # query parameter to keep this request separate from others. The x-total-count can be
        # used as a sanity check on the total result set retrieved
        if page_number == 1:
            continuation_key = response.headers.get("x-request-id")
            total_count = int(response.headers.get("x-total-count"))
            logit("x-total-count=" + str(total_count))
            logit("x-request-id=" + continuation_key)
        response_list = response.json()
        people.extend(response_list)
        page_number = page_number + 1
        # an empty page means we have moved past the end of the found set
        if len(response_list) == 0:
            done = True
    # end while
    if len(people) != total_count:
        raise Exception("Number of people records retrieved (" + str(len(people)) + ") does not match the total in the x-total-count header (" + str(total_count) + ")")
    return people
#
# *************************************************************************************************
# * Main Program
# *************************************************************************************************
#
api_key = os.environ.get('API_KEY')
api_base_url = os.environ.get('API_BASE_URL')
people_url = api_base_url + "/people"
login_url = api_base_url + "/jwt"
# get a jwt to authorize the People API requests
logit("acquiring dartapi JWT...")
dartapi_jwt = login_jwt(login_url, api_key, None)
# get all people whose name starts with S
people = get_people_by_query(dartapi_jwt, people_url, "name=S*")
logit("number of people returned = " + str(len(people)))