BRICS QUERY TOOL APPLICATION PROGRAMMING INTERFACE

## Purpose: 
### The purpose of this notebook is to provide examples of the different endpoints for the Query Tool API using Python. 
### For additional questions, please refer to the API User Guide.


## Instructions:
 1. Import the Python Dependencies. 
 2. Authenticate user permissions and retrieve access token.
 3. Create and send a query for the specific endpoint.
 - If the response status code is 200, the query was submitted successfully. For more information about status codes, refer to https://httpstatuses.com/
 4. Save the data in a dataframe or in a folder to perform analysis.
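
The status check described in step 3 can be sketched with the standard library's `http.HTTPStatus`, which maps a numeric code to its reason phrase. The helper names `describe_status` and `is_success` are hypothetical and only illustrate how a response code might be interpreted; they are not part of the Query Tool API.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Codes outside the standard registry have no known phrase
        return f"{code} (unrecognized status code)"
    return f"{status.value} {status.phrase}"

def is_success(code):
    """Any 2xx status code indicates that the request succeeded."""
    return 200 <= code < 300

print(describe_status(200))
print(describe_status(401))
```

In the cells that follow, the same check could be applied to `response.status_code` before reading the response body.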

In [None]:
#Import Python Dependencies
import pandas as pd
import requests
import json
import getpass
from io import StringIO
import os
import datetime as dt
import time
import sys

In [None]:
#create your folder for storing data
def create_folder(folder_path):
    # Append " (1)", " (2)", ... until the path does not already exist
    adjusted_folder_path = folder_path
    counter = 0
    while os.path.isdir(adjusted_folder_path):
        counter = counter + 1
        adjusted_folder_path = folder_path + ' (' + str(counter) + ')'
    os.mkdir(adjusted_folder_path)
    return adjusted_folder_path

x = dt.datetime.now()
# os.path.join keeps the path portable across operating systems
new_dir = os.path.join(os.getcwd(), "QueryDataFiles_" + x.strftime('%Y_%m_%d'))
created_dir = create_folder(new_dir)

## AUTHENTICATION
### To use the Query Tool API, the user must first log in and retrieve an access token, which is used for all subsequent endpoints. This service authenticates the user's permission to use the BRICS Query Tool API.
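
Because every later request carries the token in an `Authorization: Bearer ...` header, it can help to build that header dictionary in one place. The sketch below assumes the token is the plain-text body returned by the login call; `bearer_headers` is a hypothetical helper name, not part of the API.

```python
def bearer_headers(token, accept="application/json"):
    """Build the request headers used by the authenticated endpoints."""
    if not token:
        raise ValueError("An access token is required; log in first.")
    return {
        "accept": accept,
        "Content-type": "application/json",
        # strip() guards against a trailing newline in the token text
        "Authorization": "Bearer " + token.strip(),
    }

# Placeholder token for illustration only
example = bearer_headers("example-token")
print(example["Authorization"])
```

A helper like this would replace the repeated `headers = {...}` dictionaries in the cells below, with `accept` changed for the zip and CSV endpoints.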


In [None]:
UserUsername = input("Enter your username: ")
UserPassword = getpass.getpass("Enter your password: ")

In [None]:
#log in to the API
url = "https://brics.nei.nih.gov/gateway/authentication/user/login"
headers = {
 'accept': 'text/plain',
 'Content-Type': 'application/x-www-form-urlencoded'
}

data = {'password':UserPassword,
 'username': UserUsername}

In [None]:
response = requests.post(url, headers=headers, data=data)

#login check
if response.status_code == 200:
    print("Login Successful")
    token = response.text
    print(f'Here is your token: {token}')
else:
    print(response.status_code)
    print("Login not successful. Please check your username and password. If the error persists, contact the system administrator. THIS CODE WILL NOT PROCEED")



# Study API

## The Study APIs return the following information:
1. Studies with data that the user has access to
2. Information about a study or studies.
3. Studies associated with a form structure

## GET ALL STUDY INFORMATION
### This service returns all the studies that have data in the instance. Optionally, it returns information for a single study when the Prefix ID (Study ID) is provided.
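
The optional Prefix ID is passed as the `prefixedId` query parameter. A minimal sketch of building both forms of the URL with the standard library follows; `study_url` is a hypothetical helper, and the study ID is one that appears later in this notebook.

```python
from urllib.parse import urlencode

STUDY_URL = "https://brics.nei.nih.gov/gateway/query-api/study"

def study_url(prefixed_id=None):
    """Return the study endpoint URL, adding prefixedId only when given."""
    if prefixed_id is None:
        return STUDY_URL
    # urlencode percent-escapes any characters that are unsafe in a URL
    return STUDY_URL + "?" + urlencode({"prefixedId": prefixed_id})

print(study_url())
print(study_url("EYEGENE-STUDY0000203"))
```

Encoding the parameter rather than concatenating raw user input avoids malformed URLs if the entered ID contains spaces or other reserved characters.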


In [None]:
#get information for all studies
url ="https://brics.nei.nih.gov/gateway/query-api/study"

headers = {
 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token 
}


In [None]:
query = requests.get(url,headers =headers)
query


Output: JSON
The user will receive the following information in a JSON format:
1. Study Abstract
2. Study Status
3. Study Prefix ID
4. Study Title
5. Principal Investigator
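
Once the JSON is parsed, a few fields per study are often enough for a first look. The sketch below assumes the response is a list of dictionaries. The key names (`title`, `prefixedId`, `status`) and the sample records are made up for illustration, so the real key names should be taken from the actual response.

```python
def pick_fields(records, fields):
    """Keep only the requested keys from each record (missing keys become None)."""
    return [{name: record.get(name) for name in fields} for record in records]

# Made-up records for illustration; real key names come from the API response
sample = [
    {"title": "Example study A", "prefixedId": "EXAMPLE-STUDY0000001", "status": "Public"},
    {"title": "Example study B", "prefixedId": "EXAMPLE-STUDY0000002"},
]

for row in pick_fields(sample, ["prefixedId", "title", "status"]):
    print(row)
```

The reduced records can then be passed to `pd.DataFrame(...)` for tabular inspection.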

In [None]:
output = query.json()
output

## Study Information for one study. Example NEI BRICS-STUDY0000205

In [None]:
#get information for one study by Prefix ID
url ="https://brics.nei.nih.gov/gateway/query-api/study?prefixedId="
header = {
 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token 
}

StudyPrefixID = input("Enter Study Prefix")


In [None]:
query = requests.get(url + StudyPrefixID,headers = header)
query

In [None]:
studyinfo = query.json()
studyinfo

## GET FORM STRUCTURES FOR A STUDY
### Returns all the form structures that have data submitted for the study. 
#### Example NEI BRICS-STUDY0000205

In [None]:
url = "https://brics.nei.nih.gov/gateway/query-api/form/study?prefixedId="

headers = {
 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token 
}
studyid = input("Enter Study PrefixID")

In [None]:
query= requests.get(url + studyid, headers = headers)
query

Output Format: JSON 
The user will receive the following information in JSON format
1. Study ID
2. Forms associated with the Study
 - Form Structure Short Name
 - Form Structure title
 - Form Structure Version

In [None]:
studyformstructuredata = query.json()
studyformstructuredata



## GET ALL STUDIES ASSOCIATED WITH A FORM STRUCTURE
Returns all the studies that have data submitted to the form structure. Example: eyeGENEDemographics

In [None]:
#get studies associated with a form structure

url = "https://brics.nei.nih.gov/gateway/query-api/study/form?formName="
header = {
 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token 
}

formstructureshortname = input("Enter Form Structure Short Name")

In [None]:
query = requests.get(url + formstructureshortname,headers=header)
query

Output Format: JSON
The user will receive the following information in JSON format:
1. Study Status
2. Study ID
3. Study Title
4. Study Abstract

In [None]:
formstructureinformation = query.json()
formstructureinformation

## GET DATA ELEMENTS FOR A FORM STRUCTURE
Returns all data elements associated with the form structure. Example: eyeGENEDemographics

In [None]:
url = "https://brics.nei.nih.gov/gateway/query-api/dataElement/form/"

headers = {
# 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token 
}

formstructureshortname = input("Enter Form Structure Short Name")


In [None]:
dataelementapiquery = requests.get(url + formstructureshortname,headers = headers)
dataelementapiquery

Output Format: JSON
The user will receive the following information from the JSON.
1. Repeatable Group Name
2. Position in the Form Structure
3. Data Element ID
4. Name and Title of Data Element
5. Description of Data Element
6. Data Type
7. Input Restriction 
8. Required Type

In [None]:
dataelementapiinformation = dataelementapiquery.json()
dataelementapiinformation

## GET DATA FROM MULTIPLE FORM STRUCTURES WITHOUT DOING JOINS

In [None]:
multipleformsheader = {
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token
 }

multipleformsurl = "https://brics.nei.nih.gov/gateway/query-api/data/bulk/form/study"

In [None]:
multipleformsfilter ={
 "flattened": "false",
 "formStudies": [
 {
 "form": "eyeGENEDemographics",
 "studies": [
 "EYEGENE-STUDY0000203"
 ]
 },
 {
 "form": "eyeGENE_Solved",
 "studies": [
 "EYEGENE-STUDY0000203"
 ]
 }
 ] 
 ,
 "outputFormat": "csv"
}
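
The request body above pairs each form structure with the studies to pull it from. As a sketch, the same shape can be generated from a plain mapping of form name to study IDs; `build_form_studies_payload` is a hypothetical helper, and its defaults simply mirror the values used in `multipleformsfilter`.

```python
def build_form_studies_payload(form_to_studies, output_format="csv", flattened="false"):
    """Build a request body in the same shape as multipleformsfilter."""
    return {
        "flattened": flattened,
        "formStudies": [
            {"form": form, "studies": list(studies)}
            for form, studies in form_to_studies.items()
        ],
        "outputFormat": output_format,
    }

payload = build_form_studies_payload({
    "eyeGENEDemographics": ["EYEGENE-STUDY0000203"],
    "eyeGENE_Solved": ["EYEGENE-STUDY0000203"],
})
print(payload)
```

The resulting dictionary can be passed to `requests.post(..., json=payload)` in the same way as the hand-written filter.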

In [None]:
multipleformsquery = requests.post(multipleformsurl,headers = multipleformsheader,json = multipleformsfilter)
multipleformsquery

In [None]:
multipleformsquery.headers

Output Format: Zipped CSV Files

The user will receive the individual CSV files.
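
If the request fails, the response body may not be a zip archive at all, so it can be worth checking before extracting. The following is a minimal sketch using only the standard library; `extract_csv_zip` is a hypothetical helper, and the in-memory archive stands in for `multipleformsquery.content`.

```python
import io
import os
import tempfile
import zipfile

def extract_csv_zip(content, target_dir):
    """Extract a zip archive held in memory and return the extracted CSV paths."""
    buffer = io.BytesIO(content)
    if not zipfile.is_zipfile(buffer):
        # An error response (for example a JSON message) is not a zip archive
        raise ValueError("Response body is not a zip archive")
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(target_dir)
        names = archive.namelist()
    return sorted(os.path.join(target_dir, n) for n in names if n.lower().endswith(".csv"))

# Build a small in-memory archive to stand in for the response content
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("example_form.csv", "col_a,col_b\n1,2\n")

with tempfile.TemporaryDirectory() as tmp:
    print(extract_csv_zip(demo.getvalue(), tmp))
```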

In [None]:
#save the zipfile
import zipfile, io
z = zipfile.ZipFile(io.BytesIO(multipleformsquery.content))
z.extractall(created_dir)

In [None]:
#file information
import glob
files = sorted(glob.glob(created_dir + '/*.csv'))

for file in files:
 print("Here is the location of your files: " + file)
 print("_____________________________________________")



## Return multiple form structures for a study without a join

In [None]:
multipleformsstudy = {
 'accept':'application/zip',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token
 }

multiplformsstudyurl ="https://brics.nei.nih.gov/gateway/query-api/data/bulk/study/form"

In [None]:
multipleformsstudyfilter = {
 "flattened": "false",
 "outputFormat": "csv",
 "studyForms": [
 {
 "forms": [
 "eyeGENE_Clinical"
 ],
 "study": "NEI_BRICS-STUDY0000207"
 },
 {
 "forms": [
 "PROWL_Demo_2"
 ],
 "study": "EYEGENE-STUDY0000204"
 } 
 ]
}

In [None]:
genomicsfilter = {
 "flattened": "false",
 "outputFormat": "csv",
 "studyForms": [
 {
 "forms": ["eyeGENEGenomics","eyeGENEDemographics"],
 "study": "EYEGENE-STUDY0000203"
 }
 ]
 
}

In [None]:
multipleformsstudyquery = requests.post(multiplformsstudyurl,headers = multipleformsstudy,json = multipleformsstudyfilter)
multipleformsstudyquery 

In [None]:
multipleformsstudyquery.headers

In [None]:
created_dir = create_folder(new_dir)
a = zipfile.ZipFile(io.BytesIO(multipleformsstudyquery.content))
a.extractall(created_dir)

In [None]:
#file information
import glob
files = sorted(glob.glob(created_dir + '/*.csv'))

for file in files:
 print("Here is the location of your files: " + file)
 print("_____________________________________________")



## Return data with a filter


In [None]:
queryurl ="https://brics.nei.nih.gov/gateway/query-api/data/csv"

headers = {
 'accept': 'application/csv',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token }

In [None]:
genomicsfilter1 = {
 "formStudy": [
 {
 "form": "eyeGENEGenomics",
 "studies": ["EYEGENE-STUDY0000203"]
 },
 {
 "form": "eyeGENEDemographics",
 "studies": ["EYEGENE-STUDY0000203"]
 },
 ],
 "filter": [
 {
 "dataElement": "HGNCGeneSymbl",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "operator":"OR",
 "value": [
 "ABCA4"
 ]
 },
 {
 "dataElement": "HGNCGeneSymbl",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "operator":"AND",
 "value": [
 "PRPH2"
 ]
 },
 {
 "dataElement": "GeneVariantIndicator",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "value": [
 "yes"
 ]
 },
 {
 "dataElement": "MedicalCondNEIEnrollTyp",
 "form": "eyeGENEDemographics",
 "repeatableGroup": "Subject Demographics",
 "value": [
 "Stargardt Disease","Retinoblastoma"
 ]
 }
 ]
}



In [None]:
genomicsfilter2 = {
 "formStudy": [
 {
 "form": "eyeGENEGenomics",
 "studies": ["EYEGENE-STUDY0000203"]
 },
 {
 "form": "eyeGENEDemographics",
 "studies": ["EYEGENE-STUDY0000203"]
 },
 ],
 "filter": [
 {
 "dataElement": "HGNCGeneSymbl",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "operator":"OR",
 "value": [
 "ABCA4"
 ]
 },
 {
 "dataElement": "HGNCGeneSymbl",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "operator":"AND",
 "value": [
 "PRPH2"
 ]
 },
 {
 "dataElement": "GeneVariantIndicator",
 "form": "eyeGENEGenomics",
 "repeatableGroup": "Genomics Information",
 "value": [
 "yes"
 ]
 }
 ]
}
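
Each entry in the `"filter"` list repeats the same keys, so the entries can be produced by a small function. This is only a sketch of the dictionary shape shown above: `make_filter` is a hypothetical helper, and it makes no assumption about how the service interprets `"operator"`, which is simply omitted when not supplied.

```python
def make_filter(data_element, form, repeatable_group, values, operator=None):
    """Build one entry of the "filter" list in the same shape as genomicsfilter2."""
    entry = {
        "dataElement": data_element,
        "form": form,
        "repeatableGroup": repeatable_group,
        "value": list(values),
    }
    if operator is not None:
        # The examples above use "OR" and "AND"
        entry["operator"] = operator
    return entry

filters = [
    make_filter("HGNCGeneSymbl", "eyeGENEGenomics", "Genomics Information", ["ABCA4"], "OR"),
    make_filter("HGNCGeneSymbl", "eyeGENEGenomics", "Genomics Information", ["PRPH2"], "AND"),
    make_filter("GeneVariantIndicator", "eyeGENEGenomics", "Genomics Information", ["yes"]),
]
print(filters)
```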



In [None]:
query = requests.post(queryurl,headers=headers,json=genomicsfilter2)
query

In [None]:
print(f"Response: {query}")
print("Data received: " + query.headers["Content-Disposition"][21:96]) 
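
The fixed slice `[21:96]` only works when the header has exactly the expected length. As an alternative sketch, the filename can be read from the header with the standard library's `email.message.Message`. The header value below is a hypothetical example, not an actual response from the service.

```python
from email.message import Message

def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition header, or None."""
    message = Message()
    message["Content-Disposition"] = header_value
    return message.get_filename()

# Hypothetical header value for illustration
print(filename_from_content_disposition('attachment; filename="query_result.csv"'))
```

With a real response, the argument would be `query.headers["Content-Disposition"]`.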

In [None]:
dataset = query.text
texttodf = StringIO(dataset)
nei_data = pd.read_csv(texttodf, sep=",")
nei_data.head()

In [None]:
len(nei_data)

## Return data with a filter in JSON format

In [None]:
queryurl ="https://brics.nei.nih.gov/gateway/query-api/data/json"

headers = {
 'accept': 'application/json',
 'Content-type': 'application/json',
 'Authorization':'Bearer ' + token }

In [None]:
query = requests.post(queryurl,headers=headers,json=genomicsfilter2)
query

In [None]:
print(f"Response: {query}")
print("Data received: " + query.headers["Content-Disposition"][21:96]) 

In [None]:
jsondata = query.json()
jsondata