What are data sources


Summary

Data sources are the starting point of the data flow: data views and API methods are created from them.

The sections below describe the main characteristics of data sources.



Some basic concepts

When we talk about data sources, we must consider the following:

Type or frequency of update

Data sources can be classified according to their type or update frequency.
  1. Static: data that is not updated. For example, consolidated historical data from the previous year, or a list of regions and their identifiers.

  2. Incremental: existing data is not modified, but new data is added, generally at a known frequency. For example, new data is generated every day and must be available alongside the data from the preceding days.

  3. Dynamic: data that is modified frequently and whose value lies in being up to date. For example, the value of a daily financial indicator or the GPS position of a public transport vehicle.
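
The three update types can be illustrated with a small sketch. This is not how the platform is implemented; `UpdateType` and `apply_update` are hypothetical names introduced only to show how each type treats newly arriving data.

```python
from enum import Enum
from typing import List


class UpdateType(Enum):
    """Hypothetical labels for the three update types described above."""
    STATIC = "static"
    INCREMENTAL = "incremental"
    DYNAMIC = "dynamic"


def apply_update(kind: UpdateType, current: List[str], incoming: List[str]) -> List[str]:
    """Return the data that should be available after an update arrives."""
    if kind is UpdateType.STATIC:
        # Static data is not updated, so the existing rows are kept as they are.
        return list(current)
    if kind is UpdateType.INCREMENTAL:
        # Existing rows are untouched; new rows are added alongside them.
        return list(current) + list(incoming)
    # Dynamic data is replaced so that only the latest values are served.
    return list(incoming)
```

For example, `apply_update(UpdateType.INCREMENTAL, ["monday"], ["tuesday"])` keeps Monday's data and adds Tuesday's, while the same call with `UpdateType.DYNAMIC` keeps only Tuesday's.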



Format and type of origin

Data sources can be classified according to their format and type of origin.

  1. Plain text file (.txt)
  2. Comma-separated file (.csv)
  3. Spreadsheet file (.xlsx)
  4. Files hosted at an endpoint (HTTP(S), FTP)
  5. SOAP/XML web service
  6. REST/JSON web service
  7. Relational databases
  8. Document databases

Type of operation

Data sources can be classified according to the action or type of operation:
  1. Write (create, modify, delete)
  2. Read

Types of access to data sources

Depending on the type of data source, specific access arrangements may be required.

For example, a web service may require a particular type of authentication, or it may be necessary to add firewall rules that allow connections from the platform.
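
As a rough illustration of what "different types of authentication" can mean for a web service, the sketch below builds (but does not send) an HTTP request using either a bearer token or HTTP Basic credentials. The function name `build_request`, the URL, and the credential values are all hypothetical; the scheme a given service expects depends on that service.

```python
import base64
import urllib.request
from typing import Optional


def build_request(
    url: str,
    *,
    token: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> urllib.request.Request:
    """Build a request object carrying the credentials the service expects."""
    headers = {"Accept": "application/json"}
    if token is not None:
        # Token-based authentication: the token travels in the Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    elif username is not None and password is not None:
        # HTTP Basic authentication: "user:password" encoded as Base64.
        credentials = f"{username}:{password}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers=headers)


# Hypothetical endpoint and token, for illustration only.
request = build_request("https://example.com/api/readings", token="my-token")
```

The request is only constructed here. Sending it would additionally require that the network path is open, which is where the firewall rules mentioned above come in.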

Forms of updating data

How the data is updated depends on its type and origin. For example, a plain text file, whether static or incremental, can be updated manually or by an automated process.

Data update frequency

The data update frequency is a central aspect of API management. It should be defined in units of time (every minute, every hour, every day) or as a scheduled task, commonly known as a cron job.
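
To make the two ways of expressing a frequency concrete, the sketch below pairs each time unit with a fixed interval and with the standard five-field cron expression that fires at the same rate. The names `FREQUENCIES`, `CRON_EQUIVALENTS`, and `next_refresh` are hypothetical and do not reflect the platform's own configuration format.

```python
from datetime import datetime, timedelta

# Frequencies expressed as fixed intervals of time.
FREQUENCIES = {
    "every_minute": timedelta(minutes=1),
    "hourly": timedelta(hours=1),
    "daily": timedelta(days=1),
}

# The same rates as cron expressions (minute, hour, day of month, month, day of week).
CRON_EQUIVALENTS = {
    "every_minute": "* * * * *",  # every minute
    "hourly": "0 * * * *",        # at minute 0 of every hour
    "daily": "0 0 * * *",         # at 00:00 every day
}


def next_refresh(last_refresh: datetime, frequency: str) -> datetime:
    """Return when the source should next be consulted, given the last refresh."""
    return last_refresh + FREQUENCIES[frequency]
```

One difference worth keeping in mind: an interval counts from the last refresh, whereas a cron expression fires at fixed clock times regardless of when the previous run happened.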

The update frequency directly affects cache management. It is therefore essential to set a frequency that matches how often the data source itself is updated, in order to optimize response times and avoid unnecessary network latency.

In addition, tying cache management to the update frequency means that the source is queried only when necessary. For example, if the API management platform queries a web service that updates its data every hour, the defined frequency should be one hour.
This way, the platform queries the origin web service only once an hour, and the response remains in the cache during that period. All requests received during that interval are served from the cache.
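
The caching behavior described above can be sketched as a simple time-to-live cache. This is a minimal illustration under stated assumptions, not the platform's actual cache: `CachedSource` and its parameters are hypothetical names, and `fetch` stands in for whatever call reaches the origin web service.

```python
import time
from typing import Any, Callable, Optional


class CachedSource:
    """Wraps a fetch function and reuses its result for ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behavior can be checked without waiting
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or expired: consult the origin and restart the timer.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        # Otherwise the stored response is returned without touching the origin.
        return self._value
```

With `ttl_seconds=3600`, any number of calls to `get()` within the same hour reach the origin once; the first call after the hour has elapsed triggers a new fetch.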


What data formats are supported by Vor-Tex?

The following formats and origins allow you to create data views:

Files

  1. Open format: CSV, TSV.
  2. Text format: TXT
  3. Excel Format: XLSX
  4. OpenOffice format: ODS

Web Services

  1. REST/JSON
  2. SOAP/XML

Databases

  1. Elasticsearch
  2. MongoDB
  3. MySQL
  4. SQL Server
  5. PostgreSQL
  6. Oracle DB

Web
  1. HTML tables

Geospatial data

  1. Keyhole Markup Language (KML, KMZ)

