
Introduction
Dhrumin Shah
2015-08-22
Source:vignettes/basic-usage-vignette.Rmd
basic-usage-vignette.RmdIntroduction
This package provides easy access to the API provided by Open Government Data Platform - India to download datasets from R. Here’s the list of Datasets available through API.
Basic Usage
## Welcome to ogdindiar
When calling OGD India API, at minimum, you need to provide 2 parameters.
-
res_id: Resource id of the dataset you want to access -
api_key: Your personal API key (See Package Installation instructions)
Resource id for the datasets can be found on the data specific page on the data portal. For example, this page has the resource id information for Annual And Seasonal Mean Temperature Of India. The resource is a string that’s part of Datastore API URL.
- The URL for this dataset as shown on the page is - https://data.gov.in/api/datastore/resource.json?resource_id=98fe9271-a59d-4834-b05b-fd5ddb94ac01&api-key=OGDINDIA_API_KEY
- The resource id is the highlighted part in the above URL.
The main function this package provides is fetch_data().
Once you have figured out the resource id, you can download that dataset
as follows:
mean_temp_data = fetch_data(res_id = "98fe9271-a59d-4834-b05b-fd5ddb94ac01")This function returns a list of 2 elements.
- The first element is the data
| id | timestamp | year | annual | jan_feb | mar_may | jun_sep | oct_dec |
|---|---|---|---|---|---|---|---|
| 1123 | 1424778424 | 1957 | 23 | 18 | 25 | 27 | 21 |
| 1423 | 1424778424 | 1972 | 24 | 18 | 25 | 27 | 21 |
| 1443 | 1424778424 | 1973 | 24 | 19 | 26 | 27 | 21 |
| 1463 | 1424778424 | 1974 | 24 | 18 | 26 | 27 | 21 |
| 1483 | 1424778424 | 1975 | 23 | 18 | 25 | 26 | 21 |
| 1503 | 1424778424 | 1976 | 24 | 18 | 25 | 26 | 22 |
- The second element is a dataframe containing metadata about the columns.
knitr::kable(mean_temp_data[[2]])| .id | type | size | unsigned | not null | description |
|---|---|---|---|---|---|
| id | serial | normal | TRUE | TRUE | |
| timestamp | int | normal | TRUE | FALSE | The Unix timestamp for the data. |
| year | int | normal | NA | FALSE | |
| annual | int | normal | NA | FALSE | |
| jan_feb | int | normal | NA | FALSE | |
| mar_may | int | normal | NA | FALSE | |
| jun_sep | int | normal | NA | FALSE | |
| oct_dec | int | normal | NA | FALSE |
Advanced Usage
Instead of downloading entire datasets you can conditionally download
specific data elements. This functionality is achieved using additional
arguments to fetch_data() function. Currently you can use
-
-
filterto filter the dataset using equality constraints on specific columns. -
selectto select specific set of columns to be downloaded. -
sortto sort the resulting dataset based on multiple columns.
Following example illustrates this -
mean_temp_25 = fetch_data(res_id = "98fe9271-a59d-4834-b05b-fd5ddb94ac01",
filter = c("annual" = "25"),
select = c("year", "annual", "jan_feb", "mar_may", "jun_sep", "oct_dec"),
sort = c("jan_feb" = "asc", "mar_may" = "desc")
)The returned dataset -
| year | annual | jan_feb | mar_may | jun_sep | oct_dec |
|---|---|---|---|---|---|
| 2002 | 25 | 19 | 27 | 27 | 22 |
| 2010 | 25 | 20 | 27 | 27 | 22 |
| 1995 | 25 | 20 | 26 | 28 | 23 |
| 2009 | 25 | 20 | 26 | 27 | 22 |
| 2006 | 25 | 21 | 26 | 27 | 22 |
Metadata about the returned dataset
knitr::kable(mean_temp_25[[2]])| .id | type | size | not null | description |
|---|---|---|---|---|
| year | int | normal | FALSE | |
| annual | int | normal | FALSE | |
| jan_feb | int | normal | FALSE | |
| mar_may | int | normal | FALSE | |
| jun_sep | int | normal | FALSE | |
| oct_dec | int | normal | FALSE |
There’s one more argument that is passed to fetch_data()
function, field_type_correction. The data fetch process
inadvertently treats all the columns as character. The
default setting field_type_correction = TRUE converts these
columns back to numeric type based on accompanying
metadata.