Using Zindi data on Google Colab

If you use Google Colab to train your machine learning models for Zindi competitions, you will have noticed how slow and inefficient it is to download the training and test data to your machine and then upload the same data to Colab or Google Drive. In this article, you will learn how to pull Zindi competition or hackathon data directly into Google Colab without first downloading it to your machine.

  1. Go to the Zindi competition you are in and click on Data.
  2. Right-click anywhere in the browser window and select Inspect from the list of options (or use the shortcut Ctrl + Shift + I in Google Chrome).
  3. Click on the Network tab.
  4. Go back to the main browser window and click on the file you want to download. Let the download start, then cancel it.
  5. In the Network tab, usually under the Name heading, you will see the name of your file.
  6. Click on the file name. You will see something like:

    Request Method: GET Status Code: 200 OK

  7. Right-click on the file name, go to Copy, and choose Copy as cURL.
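
To make clearer what step 7 puts on your clipboard, here is a minimal Python sketch that splits a copied cURL command into its URL and its request headers. It assumes the usual shape of Chrome's output (curl, the URL, then a series of -H 'name: value' options). The function name parse_curl, the example.com URL, and the x-example-token header are placeholders for illustration only, not values Zindi actually uses.

```python
import shlex


def parse_curl(command):
    """Split a copied cURL command into (url, headers).

    A sketch only: it understands -H/--header and treats the first
    non-flag token after "curl" as the URL. Other options are skipped.
    """
    # Chrome may break the command over several lines with a trailing "\".
    tokens = shlex.split(command.replace("\\\n", " "))
    url = None
    headers = {}
    i = 1  # tokens[0] is the word "curl"
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header") and i + 1 < len(tokens):
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        elif token.startswith("-"):
            i += 1  # other flags (for example --compressed) are ignored here
        else:
            if url is None:
                url = token  # keep only the first bare argument as the URL
            i += 1
    return url, headers


# Placeholder command; paste your own copied command here instead.
example = (
    "curl 'https://example.com/Test.zip' "
    "-H 'accept: */*' "
    "-H 'x-example-token: abc123' "
    "--compressed"
)
url, headers = parse_curl(example)
print(url)
print(headers)
```

The headers are what let the request succeed outside your browser, which is why the whole command is copied rather than just the link.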

Paste the copied cURL command into a cell of your Colab notebook, prefixed with ! and followed by the -o option, like this:

![paste the copied cURL command] -o file.zip

Note:

  • The '-o' (output) argument lets you specify the name under which the downloaded file is saved
  • Give the file you are downloading the same format (extension) as the one on the Zindi platform

    i.e., if you are downloading Test.zip from Zindi, your command should end with -o Test.zip, and for a submission.csv file it should end with -o submission.csv

  • Finally, use the !unzip Test.zip command to unzip the contents
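
If you prefer to stay in Python rather than call !unzip, the standard library's zipfile module can do the same job. The sketch below is one possible way to do it, not part of the original workflow. The helper name extract_archive and the data folder are hypothetical. It also checks that the download really is a zip archive, since a failed request may save an error page under the name you passed to -o.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a .zip archive into dest_dir and return the member names."""
    # A failed download can leave a non-zip file behind; fail early if so.
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a valid zip archive")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call in a Colab cell, after the download has finished:
# extract_archive("Test.zip", "data")
```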

The downloaded data can be saved to Google Drive by mounting your drive in Google Colab and copying the data into it. This saves you from having to download the data from Zindi again: in later sessions you only need to mount your drive and point to the data location.
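
Mounting and copying could look roughly like the sketch below. It assumes the notebook is running on Colab, where the google.colab package is available and a mounted drive appears under /content/drive/MyDrive. The helper names mount_drive and copy_to_folder and the zindi folder are hypothetical choices for illustration.

```python
import shutil
from pathlib import Path


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive. This only works inside a Colab runtime."""
    from google.colab import drive  # imported here because it exists only on Colab
    drive.mount(mount_point)


def copy_to_folder(src, dest_dir):
    """Copy one file into dest_dir (created if missing) and return the new path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest))


# Example calls in a Colab cell (the "zindi" folder name is an assumption):
# mount_drive()
# copy_to_folder("Test.zip", "/content/drive/MyDrive/zindi")
```

In a later session, calling mount_drive() again is enough to read the file back from the same folder.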

Thanks to Steveoni

Note that the size of the file is not the amount of your own internet data this process uses. Because Colab downloads the file directly from Zindi to its own servers, downloading a 77 GB file this way does not use up 77 GB of your data.