Some basic concepts
When we talk about data sources, we must consider the following:
Type or frequency of update
Data sources can be classified according to their type or update frequency.
- Static: data that is not updated. For example, consolidated historical data from the previous year, or a reference list of regions and their identifiers.
- Incremental: existing data is not modified, but new data is added, generally at a known frequency. For example, new data is generated every day and must be available alongside the data from the preceding days.
- Dynamic: data that is modified frequently and whose value must be kept up to date. For example, the value of a daily financial indicator or the GPS position of public transport vehicles.
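The difference between the three update types can be illustrated with a minimal sketch. The names `UpdateType` and `apply_update` are hypothetical and chosen only for this example; they are not part of any platform API.

```python
from enum import Enum


class UpdateType(Enum):
    STATIC = "static"            # never changes after publication
    INCREMENTAL = "incremental"  # existing records stay, new ones are appended
    DYNAMIC = "dynamic"          # existing values are replaced by the latest ones


def apply_update(current, incoming, update_type):
    """Return the dataset that results from applying `incoming` to `current`."""
    if update_type is UpdateType.STATIC:
        # Static sources ignore new data entirely.
        return list(current)
    if update_type is UpdateType.INCREMENTAL:
        # Incremental sources keep the history and add the new records.
        return list(current) + list(incoming)
    # Dynamic sources only expose the most recent values.
    return list(incoming)
```

Under these assumptions, a daily file would be handled as `INCREMENTAL`, while a GPS feed would be handled as `DYNAMIC`.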
Format and type of origin
Data sources can be classified according to their format and type of origin.
- Plain text file (.txt)
- Comma-separated file (.csv)
- Spreadsheet file (.xlsx)
- Files hosted in an endpoint (HTTP(s), FTP)
- SOAP/XML web service
- REST/JSON web service
- Relational databases
- Document databases
Type of operation
Data sources can be classified according to the action or type of operation:
- Write (create, modify, delete)
- Read
Types of access to data sources
Depending on the type of data source, specific access may be required.
For example, a web service may require different types of authentication, or it may be necessary to add rules in the firewall to allow connections from the platform.
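As an illustration of one possible authentication scheme, the sketch below builds a request that carries HTTP Basic credentials using only the Python standard library. The URL, user name, and password are placeholders, and the helper name `build_authenticated_request` is hypothetical; a real service may require a different scheme, such as tokens or API keys.

```python
import base64
import urllib.request


def build_authenticated_request(url, username, password):
    """Build (but do not send) a GET request with an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Placeholder values, for illustration only.
request = build_authenticated_request("https://example.com/service", "user", "secret")
```

The request is only constructed here; sending it with `urllib.request.urlopen(request)` would also require that the firewall allows the connection.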
How the data is updated depends on its type and origin. For example, if the source is a plain text file, whether static or incremental, it can be updated manually or by an automated process.
Data update frequency
The frequency of data updates is a central aspect to consider in API management. The frequency should be defined in units of time (every minute, every hour, every day) or as scheduled update tasks, known as cron jobs.
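A frequency expressed in units of time can be turned into the moment of the next update, as in the following minimal sketch. The table `UNIT_TO_INTERVAL` and the function `next_update` are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta

# Supported units, matching the examples above (minute, hour, day).
UNIT_TO_INTERVAL = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}


def next_update(last_update, every, unit):
    """Return when the source should be refreshed again.

    `every` is a whole number of units, e.g. every=6, unit="hour".
    """
    return last_update + every * UNIT_TO_INTERVAL[unit]


last = datetime(2024, 1, 1, 12, 0)
print(next_update(last, 1, "hour"))
```

A cron schedule expresses the same idea declaratively; for instance, the standard cron expression `0 * * * *` runs a task at minute 0 of every hour.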
In addition, defining the update frequency of a data source drives cache management, so that the source is queried only when necessary. For example, if the API management platform consults a web service that updates its data every hour, the defined frequency should be one hour.
This way, the platform queries the origin web service only once an hour, and the response remains in cache during that period. All queries received during that interval are served from the cache.
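The caching behavior described above can be sketched as a small time-to-live (TTL) cache. This is a simplified illustration, not the platform's implementation; the class `CachedSource` and its parameters are hypothetical.

```python
import time


class CachedSource:
    """Wrap a fetch function so the origin is queried at most once per TTL."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable that queries the origin
        self._ttl = ttl_seconds    # e.g. 3600 for a source updated hourly
        self._clock = clock        # injectable clock, useful for testing
        self._value = None
        self._expires_at = None    # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or expired: query the origin and store the answer.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        # Otherwise the stored answer is returned without touching the origin.
        return self._value
```

With `ttl_seconds=3600`, any number of calls to `get()` within the same hour results in a single call to the origin.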
The following formats and origins allow you to create data views:
Files
- Open format: CSV, TSV.
- Text format: TXT
- Excel format: XLSX
- OpenOffice format: ODS
Web Services
- REST/JSON
- SOAP/XML
Databases
- Elasticsearch
- MongoDB
- MySQL
- SQL Server
- PostgreSQL
- Oracle DB
- Keyhole Markup Language (KML, KMZ)