Docker Archive

Warning

The Docker Archive documentation is deprecated

The old workflow described in these docs will not be supported with new Lightly Worker versions above 2.6. Please switch to our new documentation page instead.

We all know that working with ML data sometimes means dealing with really BIG datasets. The cloud solution is great for exploration and prototyping, and it is an easy way to work with Lightly. But there is more!

With our on-premise solution, you can process larger datasets entirely on your end, without any data leaving your infrastructure. We worked hard to make this happen and are very proud to present the following features:

  • NEW Using the Docker with a Cloud Bucket as Remote Datasource

  • NEW Trigger a Docker Job from the Platform or code

  • Active Learning using Lightly Docker

  • Automatically upload the selected dataset to the Lightly Platform (see Upload Sampled Dataset To Lightly Platform)

  • See your Docker runs live in the Lightly Platform (see Live View of Docker Status)

  • Lightly Docker has built-in pretagging models (see Pretagging)

    • Use this feature to pre-label your dataset or to only select images which contain certain objects

    • Supported object categories are: bicycle, bus, car, motorcycle, person, train, truck

  • Select from more than 1 million samples within a few hours!

  • Runs directly on videos without extracting the frames first!

  • Wrapped in a Docker container (no setup required if your system supports Docker)

  • Configurable

    • Use stopping conditions for the selection strategy such as minimum distance between two samples

    • Use various selection strategies

    • Check for corrupt files and report them

    • Check for exact duplicates and report them

    • We expose the full Lightly SSL framework config

  • Automated reporting of the datasets for each run

    • PDF report with histograms, plots, statistics, and much more …

  • Hand-optimized code (down to the instruction level)

    • Multithreaded

    • SIMD instructions

  • Minimal hardware requirements:

    • 1 CPU core

    • 4 GB free RAM

  • Recommended hardware: