{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "BRICS QUERY TOOL APPLICATION PROGRAMMING INTERFACE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Purpose: \n", "### The purpose of this notebook is to provide examples of the different endpoints for the Query Tool API using Python. \n", "### For additional questions, please refer to the API User Guide.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instructions:\n", " 1. Import the Python dependencies. \n", " 2. Authenticate user permissions and retrieve the access token.\n", " 3. Create and send a query for the specific endpoint.\n", "    - A response status code of 200 means that the query was submitted successfully. For more information about status codes, please refer to this link: https://httpstatuses.com/\n", " 4. Save the data in a dataframe or in a folder to perform analysis. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Import Python Dependencies\n", "import pandas as pd\n", "import requests\n", "import json\n", "import getpass\n", "from io import StringIO\n", "import os\n", "import datetime as dt\n", "import time\n", "import sys" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#create your folder for storing data; if the folder already exists, a counter is appended, e.g. 'name (1)'\n", "def create_folder(folder_path):\n", "    adjusted_folder_path = folder_path\n", "    folder_found = os.path.isdir(adjusted_folder_path)\n", "    counter = 0\n", "    while folder_found:\n", "        counter = counter + 1\n", "        adjusted_folder_path = folder_path + ' (' + str(counter) + ')'\n", "        folder_found = os.path.isdir(adjusted_folder_path)\n", "    os.mkdir(adjusted_folder_path)\n", "    return adjusted_folder_path\n", "\n", "x=dt.datetime.now()\n", "#os.path.join keeps the path portable across operating systems\n", "new_dir = os.path.join(os.getcwd(), \"QueryDataFiles_\"+x.strftime('%Y_%m_%d'))\n", "created_dir = create_folder(new_dir)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## AUTHENTICATION\n", "### To log 
in to the Query Tool API, the user must authenticate and retrieve the access token that is used for subsequent endpoints. This service authenticates a user's permission to use the BRICS Query Tool API.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "UserPassword = getpass.getpass(\"Enter your password\")\n", "UserUsername = input(\"Enter your username\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#log in to the API \n", "url = \"https://brics.nei.nih.gov/gateway/authentication/user/login\"\n", "headers = {\n", " 'accept': 'text/plain',\n", " 'Content-Type': 'application/x-www-form-urlencoded'\n", "}\n", "\n", "data = {'password':UserPassword,\n", " 'username': UserUsername}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "response = requests.post(url, headers=headers, data=data)\n", "\n", "#login check: on a 200 status code the response body is the access token\n", "if response.status_code == 200:\n", "    print(\"Login Successful\")\n", "    token=response.text\n", "    print(f'Here is your token: {token}')\n", "else:\n", "    print(response.status_code)\n", "    print(\"Login not Successful. Please check username and password. If the error still occurs, reach out to the system administrator. THIS CODE WILL NOT PROCEED\")\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Study API\n", "\n", "## The Study APIs return the following information:\n", "1. Studies with data that the user has access to\n", "2. Information about a study or studies\n", "3. Studies associated with a form structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GET ALL STUDY INFORMATION\n", "### This service will return all the studies that have data in the instance. Optionally, it will return information for the study with the provided Prefix ID (Study ID). 
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#get information for all studies\n", "url =\"https://brics.nei.nih.gov/gateway/query-api/study\"\n", "\n", "headers = {\n", " 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token \n", "}\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = requests.get(url,headers =headers)\n", "query\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output: JSON\n", "The user will receive the following information in JSON format:\n", "1. Study Abstract\n", "2. Study Status\n", "3. Study Prefix ID\n", "4. Study Title\n", "5. Principal Investigator" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output = query.json()\n", "output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Study Information for one study. Example NEI BRICS-STUDY0000205" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#get information for one study by its Prefix ID\n", "url =\"https://brics.nei.nih.gov/gateway/query-api/study?prefixedId=\"\n", "header = {\n", " 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token \n", "}\n", "\n", "StudyPrefixID = input(\"Enter Study Prefix ID\")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = requests.get(url + StudyPrefixID,headers = header)\n", "query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studyinfo = query.json()\n", "studyinfo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GET FORM STRUCTURES FOR A STUDY\n", "### Returns all the form structures that have data submitted for the study. 
\n", "#### Example NEI BRICS-STUDY0000205" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url = \"https://brics.nei.nih.gov/gateway/query-api/form/study?prefixedId=\"\n", "\n", "headers = {\n", " 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token \n", "}\n", "studyid = input(\"Enter Study Prefix ID\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query= requests.get(url + studyid, headers = headers)\n", "query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output Format: JSON \n", "The user will receive the following information in JSON format:\n", "1. Study ID\n", "2. Forms associated with the Study\n", " - Form Structure Short Name\n", " - Form Structure Title\n", " - Form Structure Version" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "studyformstructuredata = query.json()\n", "studyformstructuredata\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GET ALL STUDIES ASSOCIATED WITH A FORM STRUCTURE\n", "Returns all the studies that have data submitted to the form structure.\n", "Example: eyeGENEDemographics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#get the studies associated with a form structure\n", "\n", "url = \"https://brics.nei.nih.gov/gateway/query-api/study/form?formName=\"\n", "header = {\n", " 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token \n", "}\n", "\n", "formstructureshortname = input(\"Enter Form Structure Short Name\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = requests.get(url + formstructureshortname,headers=header)\n", "query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output Format: JSON\n", "The user will receive the following information in JSON 
format:\n", "1. Study Status\n", "2. Study ID\n", "3. Study Title\n", "4. Study Abstract" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "formstructureinformation = query.json()\n", "formstructureinformation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GET DATA ELEMENTS FOR A FORM STRUCTURE\n", "Returns all data elements associated with the form structure. Example: eyeGENEDemographics" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url = \"https://brics.nei.nih.gov/gateway/query-api/dataElement/form/\"\n", "\n", "headers = {\n", "# 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token \n", "}\n", "\n", "# print(\"Input Form Structure Short Name\")\n", "formstructureshortname = input()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataelementapiquery = requests.get(url + formstructureshortname,headers = headers)\n", "dataelementapiquery" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output Format: JSON\n", "The user will receive the following information in JSON format:\n", "1. Repeatable Group Name\n", "2. Position in the Form Structure\n", "3. Data Element ID\n", "4. Name and Title of Data Element\n", "5. Description of Data Element\n", "6. Data Type\n", "7. Input Restriction \n", "8. 
Required Type" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataelementapiinformation = dataelementapiquery.json()\n", "dataelementapiinformation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GET DATA FROM MULTIPLE FORM STRUCTURES WITHOUT DOING JOINS" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsheader = {\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token\n", " }\n", "\n", "multipleformsurl = \"https://brics.nei.nih.gov/gateway/query-api/data/bulk/form/study\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsfilter ={\n", " \"flattened\": \"false\",\n", " \"formStudies\": [\n", " {\n", " \"form\": \"eyeGENEDemographics\",\n", " \"studies\": [\n", " \"EYEGENE-STUDY0000203\"\n", " ]\n", " },\n", " {\n", " \"form\": \"eyeGENE_Solved\",\n", " \"studies\": [\n", " \"EYEGENE-STUDY0000203\"\n", " ]\n", " }\n", " ] \n", " ,\n", " \"outputFormat\": \"csv\"\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsquery = requests.post(multipleformsurl,headers = multipleformsheader,json = multipleformsfilter)\n", "multipleformsquery" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsquery.headers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Output Format: Zipped CSV Files\n", "\n", "The user will receive the individual CSV files." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#extract the zipped CSV files from the response into the data folder\n", "import zipfile, io\n", "z = zipfile.ZipFile(io.BytesIO(multipleformsquery.content))\n", "z.extractall(created_dir)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#file information\n", "import glob\n", "files = 
sorted(glob.glob(created_dir + '/*.csv'))\n", "\n", "for file in files:\n", " print(\"Here is the location of your files: \" + file)\n", " print(\"_____________________________________________\")\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Return multiple form structures for a study without a join" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsstudy = {\n", " 'accept':'application/zip',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token\n", " }\n", "\n", "multiplformsstudyurl =\"https://brics.nei.nih.gov/gateway/query-api/data/bulk/study/form\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsstudyfilter = {\n", " \"flattened\": \"false\",\n", " \"outputFormat\": \"csv\",\n", " \"studyForms\": [\n", " {\n", " \"forms\": [\n", " \"eyeGENE_Clinical\"\n", " ],\n", " \"study\": \"NEI_BRICS-STUDY0000207\"\n", " },\n", " {\n", " \"forms\": [\n", " \"PROWL_Demo_2\"\n", " ],\n", " \"study\": \"EYEGENE-STUDY0000204\"\n", " } \n", " ]\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "genomicsfilter = {\n", " \"flattened\": \"false\",\n", " \"outputFormat\": \"csv\",\n", " \"studyForms\": [\n", " {\n", " \"forms\": [\"eyeGENEGenomics\",\"eyeGENEDemographics\"],\n", " \"study\": \"EYEGENE-STUDY0000203\"\n", " }\n", " ]\n", " \n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsstudyquery = requests.post(multiplformsstudyurl,headers = multipleformsstudy,json = multipleformsstudyfilter)\n", "multipleformsstudyquery " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "multipleformsstudyquery.headers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "created_dir = 
create_folder(new_dir)\n", "a = zipfile.ZipFile(io.BytesIO(multipleformsstudyquery.content))\n", "a.extractall(created_dir)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#file information\n", "import glob\n", "files = sorted(glob.glob(created_dir + '/*.csv'))\n", "\n", "for file in files:\n", " print(\"Here is the location of your files: \" + file)\n", " print(\"_____________________________________________\")\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Return data with a filter\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "queryurl =\"https://brics.nei.nih.gov/gateway/query-api/data/csv\"\n", "\n", "headers = {\n", " 'accept': 'application/csv',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token }" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "genomicsfilter1 = {\n", " \"formStudy\": [\n", " {\n", " \"form\": \"eyeGENEGenomics\",\n", " \"studies\": [\"EYEGENE-STUDY0000203\"]\n", " },\n", " {\n", " \"form\": \"eyeGENEDemographics\",\n", " \"studies\": [\"EYEGENE-STUDY0000203\"]\n", " },\n", " ],\n", " \"filter\": [\n", " {\n", " \"dataElement\": \"HGNCGeneSymbl\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"operator\":\"OR\",\n", " \"value\": [\n", " \"ABCA4\"\n", " ]\n", " },\n", " {\n", " \"dataElement\": \"HGNCGeneSymbl\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"operator\":\"AND\",\n", " \"value\": [\n", " \"PRPH2\"\n", " ]\n", " },\n", " {\n", " \"dataElement\": \"GeneVariantIndicator\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"value\": [\n", " \"yes\"\n", " ]\n", " },\n", " {\n", " \"dataElement\": \"MedicalCondNEIEnrollTyp\",\n", " \"form\": \"eyeGENEDemographics\",\n", " 
\"repeatableGroup\": \"Subject Demographics\",\n", " \"value\": [\n", " \"Stargardt Disease\",\"Retinoblastoma\"\n", " ]\n", " }\n", " ]\n", "}\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "genomicsfilter2 = {\n", " \"formStudy\": [\n", " {\n", " \"form\": \"eyeGENEGenomics\",\n", " \"studies\": [\"EYEGENE-STUDY0000203\"]\n", " },\n", " {\n", " \"form\": \"eyeGENEDemographics\",\n", " \"studies\": [\"EYEGENE-STUDY0000203\"]\n", " },\n", " ],\n", " \"filter\": [\n", " {\n", " \"dataElement\": \"HGNCGeneSymbl\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"operator\":\"OR\",\n", " \"value\": [\n", " \"ABCA4\"\n", " ]\n", " },\n", " {\n", " \"dataElement\": \"HGNCGeneSymbl\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"operator\":\"AND\",\n", " \"value\": [\n", " \"PRPH2\"\n", " ]\n", " },\n", " {\n", " \"dataElement\": \"GeneVariantIndicator\",\n", " \"form\": \"eyeGENEGenomics\",\n", " \"repeatableGroup\": \"Genomics Information\",\n", " \"value\": [\n", " \"yes\"\n", " ]\n", " }\n", " ]\n", "}\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = requests.post(queryurl,headers=headers,json=genomicsfilter2)\n", "query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Response: {query}\")\n", "print(\"Data received: \" + query.headers[\"Content-Disposition\"][21:96]) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = query.text\n", "texttodf = StringIO(dataset)\n", "nei_data = pd.read_csv(texttodf, sep=\",\")\n", "nei_data.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(nei_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Return data with filter with 
JSON Format" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "queryurl =\"https://brics.nei.nih.gov/gateway/query-api/data/json\"\n", "\n", "headers = {\n", " 'accept': 'application/json',\n", " 'Content-type': 'application/json',\n", " 'Authorization':'Bearer ' + token }" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = requests.post(queryurl,headers=headers,json=genomicsfilter2)\n", "query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Response: {query}\")\n", "print(\"Data received: \" + query.headers[\"Content-Disposition\"][21:96]) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "jsondata = query.json()\n", "jsondata" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }