@@ -38,12 +38,74 @@ pak::pak("The-Strategy-Unit/azkit")
3838
3939## Usage
4040
41- _ To be added._
41+ A primary function in `{azkit}` enables access to an Azure blob container:
42+
43+ ``` r
44+ data_container <- azkit::get_container()
45+
46+ ```
47+ Authentication is handled "under the hood" by the `get_container()` function,
48+ but if you need to, you can explicitly return an authentication token for
49+ inspection or testing:
50+
51+ ``` r
52+ my_token <- azkit::get_auth_token()
53+
54+ ```
55+
56+ By default, the container returned is the one named in the `AZ_CONTAINER`
57+ environment variable, if it is set, but you can override this by supplying
58+ a container name to the function:
59+
60+ ``` r
61+ custom_container <- azkit::get_container("custom")
62+ ```
63+
64+ Return a list of all available containers in your default Azure storage with:
65+
66+ ``` r
67+ azkit::list_container_names()
68+ ```
69+
70+ Once you have access to a container, you can use one of a set of data reading
71+ functions to bring data into R from `.parquet`, `.rds`, `.json` or `.csv` files:
72+
73+ ``` r
74+ pqt_data <- azkit::read_azure_parquet(data_container, "v_important_data")
75+
76+ ```
77+
78+ The functions will try to match a file of the required type using the `file`
79+ name supplied. In the case above, "v_important_data" would match a file named
80+ "v_important_data.parquet", so there is no need to supply the file extension.
81+
82+ By default the `read_*` functions will look in the root folder of the container.
83+ To specify a subfolder, supply this to the `path` argument.
84+ The functions will _not_ search recursively into further subfolders, so the path
85+ needs to be full and accurate.
86+
87+ If more than one file matches the string supplied to the `file` argument,
88+ the functions will throw an error.
89+ Specifying the exact filename avoids this, but shorter `file`
90+ arguments may be convenient in some situations.
91+
92+ Currently these functions only read in a single file at a time.
93+
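The matching rules described above can be sketched in base R. This is an illustration only: `match_one_file()` is a hypothetical helper, not part of `{azkit}`, and it assumes that matching means "the file name contains the supplied string", which the text above does not spell out. A plain character vector stands in for the container's file listing.

```r
# Hypothetical helper: NOT part of {azkit}. It only mimics the rules described
# above, using a character vector of file paths in place of a real container.
match_one_file <- function(files, file, ext, path = "") {
  # dirname() returns "." for files in the root folder
  folder <- if (nzchar(path)) path else "."
  # Keep only files directly inside `folder` (no recursion into subfolders)
  in_folder <- files[dirname(files) == folder]
  # Keep files of the required type whose name contains the `file` string
  right_type <- in_folder[tools::file_ext(in_folder) == ext]
  hits <- right_type[grepl(file, basename(right_type), fixed = TRUE)]
  if (length(hits) != 1) {
    stop("Expected exactly 1 matching file, found ", length(hits))
  }
  hits
}

files <- c(
  "v_important_data.parquet", "v_important_data.csv",
  "data/vital_data.csv", "data/vital_stats.csv"
)

# No extension needed: the required type narrows the match
match_one_file(files, "v_important_data", "parquet")
# A subfolder has to be given explicitly via `path`
match_one_file(files, "vital_data", "csv", path = "data")
# A shorter string is ambiguous here, so this raises an error
try(match_one_file(files, "vital", "csv", path = "data"))
```

The error on ambiguous matches is the reason a fuller `file` string is the safer choice in folders that hold similarly named files.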
94+ Setting the `info` argument to `TRUE` makes the functions give some
95+ confirmatory feedback on which file is being read in.
96+ You can also pass arguments through to, for example, `readr::read_csv()`:
97+
98+ ``` r
99+ csv_data <- data_container |>
100+   azkit::read_azure_csv("vital_data.csv", path = "data", col_types = "ccci")
101+
102+ ```
42103
43104## Environment variables
44105
45- To access Azure Storage you need to add some variables to a
46- [`.Renviron` file][posit_env] in your project.
106+ To access Azure Storage you will want to set some environment variables.
107+ The neatest way to do this is to include a [`.Renviron` file][posit_env] in
108+ your project folder.
47109
48110⚠️These values are sensitive and should not be exposed to anyone outside The
49111Strategy Unit.
@@ -54,12 +116,24 @@ Your `.Renviron` file should contain the variables below.
54116Ask a member of [the Data Science team][suds] for the necessary values.
55117
56118```
119+ # essential
57120AZ_STORAGE_EP=
58- AZ_STORAGE_CONTAINER=
121+ # useful but not absolutely essential:
122+ AZ_CONTAINER=
123+
124+ # optional, for certain authentication scenarios:
125+ AZ_TENANT_ID=
126+ AZ_CLIENT_ID=
127+ AZ_APP_SECRET=
59128```
60129
61130These may vary depending on the specific container you’re connecting to.
62131
132+ For one project you might want to set the default container (`AZ_CONTAINER`) to
133+ one value, but for a different project you might mainly be working with a
134+ different container. It therefore makes sense to set the values within the
135+ `.Renviron` file for each project, rather than globally for your account.
136+
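A quick way to confirm that a project's `.Renviron` has been picked up is to check the variables from within R. The sketch below uses only base R and reports whether each variable is set without printing the sensitive values. It assumes the `.Renviron` file sits in the current working directory.

```r
# Re-read the project-level file if it exists; R otherwise only reads
# .Renviron at startup, so edits are not seen until a restart.
if (file.exists(".Renviron")) {
  readRenviron(".Renviron")
}

# Sys.getenv() returns "" for unset variables, so nzchar() flags the gaps
az_vars <- c(
  "AZ_STORAGE_EP", "AZ_CONTAINER",
  "AZ_TENANT_ID", "AZ_CLIENT_ID", "AZ_APP_SECRET"
)
is_set <- nzchar(Sys.getenv(az_vars))
names(is_set) <- az_vars
is_set # TRUE/FALSE only, never the values themselves

if (!is_set[["AZ_STORAGE_EP"]]) {
  message("AZ_STORAGE_EP is not set; add it to the project's .Renviron file")
}
```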
63137## Getting help
64138
65139Please use the [Issues][issues] feature on GitHub to report any bugs, ideas