[Experimental] This function performs a search on the school directory at uiv.cz and returns the resulting export: the XLS files, the data, or both. The school directory is a version of the school register: unlike the core register, it contains contact information but lacks some other information (such as unique address identification). Use vz_get_register() for the core register.

vz_get_directory(
  tables = c("addresses", "schools", "locations", "specialisations"),
  ...,
  return_tibbles = FALSE,
  write_files = TRUE,
  dest_dir = getwd()
)

Arguments

tables

a character vector of tables to retrieve. See Tables below.

...

key-value pairs of search fields. Use vz_get_search_fields() to see a list of fields and their potential values.

return_tibbles

Whether to return the data (if TRUE) or only download the files (if FALSE).

write_files

Whether to write the XLS files locally.

dest_dir

Directory in which to write XLS files. Defaults to working directory.

Value

If return_tibbles is TRUE, a named list of tibbles, one for each table in tables and named accordingly; if tables has length one, the result is a single tibble instead of a list. If return_tibbles is FALSE, the result is a character vector of paths to the downloaded *.xls files.

Note that the downloaded XLS files are in fact HTML files. You are best off loading and tidying them with vz_load_directory(), though they can be opened in Excel too.
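Because the result is a single tibble for one table but a named list for several, downstream code may want one consistent shape. The following is a minimal sketch of that idea; as_table_list() is a hypothetical helper, not part of the package, and the data frames are stand-ins rather than real directory data.

```r
# Hypothetical helper (not part of the package): always return a named list,
# whether vz_get_directory() gave back a single tibble or a list of tibbles.
as_table_list <- function(result, tables) {
  if (is.data.frame(result)) {
    # A bare tibble/data frame is only expected when one table was requested.
    stopifnot(length(tables) == 1)
    return(stats::setNames(list(result), tables))
  }
  result
}

# Offline illustration with placeholder data frames instead of a real download.
single <- data.frame(id = "a")
multi <- list(addresses = single, schools = single)

names(as_table_list(single, "addresses"))
names(as_table_list(multi, c("addresses", "schools")))
```

With this wrapper, code that iterates over tables does not need a special case for calls that request only one table.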

Tables

Tables can include "addresses", "schools", "locations", "specialisations". If you need more tables based on the same query (fields), pass them into a single function call in order to avoid burdening the data provider's server (the server needs to perform a search for each function call; there is no caching and no data dumps are made available).

What this does

The function

  • performs a search on the school directory at uiv.cz

  • searches for all schools by default, unless ... parameters are set to narrow down the search

  • traverses the results to the export links

  • downloads the XLS files

  • loads them into tibbles if return_tibbles is TRUE

This is the only way to get to the data - there are no static dumps available. At the same time, no intense web scraping takes place - only individual export files (max 4 per call) are downloaded the same way as it would be done manually.

Note

To avoid blitzing the data provider's server with many heavy requests:

  1. If you need more tables based on the same search, request them in one call, using the tables argument. This means that only one initial search is performed.

  2. Only ask for the tables you need.

  3. If you need a subset of the data, use the fields (...) argument.

  4. If you need multiple subsets of the data, try to do that via the fields (...) argument too, though that may not always be possible.

  5. If you are downloading a large dump and reusing it in a pipeline, keep the downloaded XLS files (or your own export) locally by setting write_files to TRUE, use caching, and avoid calling this function repeatedly. Ideally, make any reruns conditional on the age of the stored export or use a pipeline management framework such as targets.
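One way to make reruns conditional on the age of a stored export is sketched below. This is an illustration under assumptions, not the package's own mechanism: is_stale() and get_directory_cached() are hypothetical helpers, the cache is an RDS file of your own export, and the 30-day threshold and file name are arbitrary.

```r
# Hypothetical helper: TRUE when `path` is missing or older than `max_age_days`.
is_stale <- function(path, max_age_days = 30) {
  if (!file.exists(path)) return(TRUE)
  age <- difftime(Sys.time(), file.info(path)$mtime, units = "days")
  as.numeric(age) > max_age_days
}

# Hypothetical wrapper: call `fetch()` only when the cached copy is stale,
# otherwise read the stored export from disk.
get_directory_cached <- function(cache_file, max_age_days = 30, fetch) {
  if (is_stale(cache_file, max_age_days)) {
    data <- fetch()
    saveRDS(data, cache_file)
    return(data)
  }
  readRDS(cache_file)
}

# Usage sketch (commented out: it would query the data provider's server).
# prague <- get_directory_cached(
#   "directory_cz010.rds",
#   fetch = function() vz_get_directory(
#     c("addresses", "schools"), uzemi = "CZ010", return_tibbles = TRUE
#   )
# )
```

Passing the download as a function means the server is contacted only when the cache is missing or too old; a pipeline tool such as targets achieves the same effect with less hand-written code.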

Examples

vz_get_directory("addresses", uzemi = "CZ010", return_tibbles = TRUE, write_files = TRUE)
#> Downloaded 533.56 kB
#> # A tibble: 966 x 27
#>    red_izo  ico    zrizovatel uzemi kraj   spravni_urad orp   plny_nazev
#>    <chr>    <chr>  <chr>      <chr> <chr>  <chr>        <chr> <chr>
#>  1 6000002… 49625… 6          CZ01… Hlavn… B11000       1101  "Mateřská škola s…
#>  2 6000002… 61379… 6          CZ01… Hlavn… B11000       1112  "Církevní mateřsk…
#>  3 6000002… 25765… 5          CZ01… Hlavn… B11000       1112  "Modrý klíč - zák…
#>  4 6000002… 60437… 6          CZ01… Hlavn… B11000       1113  "Církevní mateřsk…
#>  5 6000002… 25143… 5          CZ01… Hlavn… B11000       1113  "Bilingvální mate…
#>  6 6000002… 25637… 5          CZ01… Hlavn… B11000       1116  "Soukromá mateřsk…
#>  7 6000002… 25642… 5          CZ01… Hlavn… B11000       1113  "Soukromá mateřsk…
#>  8 6000002… 49625… 6          CZ01… Hlavn… B11000       1107  "Katolická mateřs…
#>  9 6000003… 60447… 6          CZ01… Hlavn… B11000       1108  "Církevní mateřsk…
#> 10 6000003… 61507… 5          CZ01… Hlavn… B11000       1109  "Mateřská škola -…
#> # … with 956 more rows, and 19 more variables: zkraceny_nazev <chr>,
#> #   ulice <chr>, c_p <chr>, c_or <chr>, c_obce <chr>, psc <chr>, misto <chr>,
#> #   telefon <chr>, fax <chr>, email_1 <chr>, email_2 <chr>, www <chr>,
#> #   id_dat_schranky_subjektu <chr>, reditel <chr>, x <chr>, je_ovm <chr>,
#> #   zuj <chr>, email_zrizovatele <chr>, id_dat_schranky_zrizovatele <chr>