Connector

The data connector is an automated application that retrieves a file, determines its delimited format, and converts it into an output datatable. It takes the following parameters:

  • data_url: the URL of the data file (CSV, TSV, PSV, or another delimited format)
  • delimiter: this can be supplied if required, but will be automatically determined if left blank
  • skiprows: the number of rows to skip before the header line in the datafile
  • skipcols: the number of columns to skip before the first data column
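
As a rough illustration of how these parameters interact, the sketch below parses already-downloaded text using only the Python standard library. It is not the connector's actual source: the function name parse_delimited, the candidate delimiter set, and the use of csv.Sniffer for automatic detection are all assumptions made for this example.

```python
import csv
import io


def parse_delimited(text, delimiter=None, skiprows=0, skipcols=0):
    """Hypothetical helper: parse delimited text into (header, rows)."""
    # Drop any preamble lines that precede the header line.
    lines = text.splitlines()[skiprows:]
    if not lines:
        return [], []
    if not delimiter:
        # Assumption: guess the delimiter from the first few lines.
        # csv.Sniffer raises csv.Error if it cannot decide.
        sample = "\n".join(lines[:10])
        delimiter = csv.Sniffer().sniff(sample, delimiters=",\t|;").delimiter
    reader = csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter)
    # Drop leading columns from every row, including the header.
    table = [row[skipcols:] for row in reader if row]
    return table[0], table[1:]


# Pipe-separated sample; the delimiter is left blank and detected.
header, rows = parse_delimited("id|radius|label\n1|14.2|M\n2|11.9|B\n")
```

Passing an explicit delimiter skips detection entirely, which is the safer choice for files where guessing could be ambiguous, such as single-column data.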

To create a data connector model, you can do the following:

  1. Go to the App Manager, and select Create Application -> Python Model, named Connector - Breast Cancer.
  2. Click the Git Clone button on the toolbar, and enter the Git URL: https://gitlab.com/optika-solutions/apps/xsv-http-connector.git. You can leave the username and password blank. Click the refresh button next to the branch selector, select the branch master, then click OK.
  3. Go to the Research Grid, and you can enter the url https://s3-ap-southeast-2.amazonaws.com/akumen-public-bucket/data.csv into data_url. All other fields can be left blank. [Figure: Research Grid]
  4. Execute the scenario, and check the data_vw view in the Data tab to see the loaded data. [Figure: Data View, Cancer Data]