Data Management

All Your Visual Data. One Place.

Aggregate, organize, and explore billions of images and videos from any source. One unified repository for all your computer vision data.

Connect: AWS S3, Google Cloud, Azure
Architecture

How it works under the hood

Connects to S3, GCP, or Azure. Ingests any image or video format. Indexes everything so you can query it later.

Sources: AWS S3, GCP, Azure
Datalake: 2.4M assets indexed, 847 GB storage, 12 ms latency
Outputs: 24 datasets, 156 experiments, 8 deployments
upload.py
from picsellia import Client

client = Client()
datalake = client.get_datalake()

# Upload with metadata
datalake.upload_data(
  filepaths="./images/*.jpg",
  tags=["production", "batch-42"],
  metadata={"reference": "factory-A"}
)

# Query with filters
data = datalake.list_data(
  tags=["production"]
)
Python SDK v6.9.0 • Auto EXIF extraction • Batch upload
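
This page does not say whether `filepaths` accepts a glob pattern or only explicit paths. A minimal sketch, assuming explicit paths are the safer input: expand the pattern locally and split the result into batches before passing each batch to `upload_data`. The helper names `expand_paths` and `batched`, and the batch size of 500, are illustrative and not part of the SDK.

```python
import glob
from typing import Iterator, List


def expand_paths(pattern: str) -> List[str]:
    """Expand a glob pattern into a sorted list of matching file paths."""
    return sorted(glob.glob(pattern))


def batched(paths: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


# Hypothetical usage with the datalake object from upload.py:
# for batch in batched(expand_paths("./images/*.jpg"), 500):
#     datalake.upload_data(filepaths=batch, tags=["production", "batch-42"])
```

Uploading in batches keeps a failed request from invalidating the whole run and makes it easy to resume from the last successful batch.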
query.py
# Query with tags
data = datalake.list_data(
  tags=["defects"]
)
# ✓ 2,847 results

# Query with custom_metadata filter
data = datalake.list_data(
  custom_metadata={"location": "factory-A"}
)
# ✓ 1,245 results

# Combine multiple tags and cap the number of results
data = datalake.list_data(
  tags=["production", "validated"],
  limit=1000
)
Python SDK • tags • metadata • filters

Image & Video Format Support

Ingest standard visual data formats

Images: .jpg, .png, .tiff, .webp, .bmp, .gif
Video: .mp4, .mov
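
To screen a local folder before upload, the format list above can be turned into a small lookup. This is a sketch, not platform code: the function name `media_type` is hypothetical, and variants that are not listed on this page (such as `.jpeg` or `.tif`) are treated as unknown here even though the platform may accept them.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the format list above.
IMAGE_EXTENSIONS = {".jpg", ".png", ".tiff", ".webp", ".bmp", ".gif"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}


def media_type(filename: str) -> Optional[str]:
    """Return "image", "video", or None when the extension is not listed."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```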

Processing Pipeline

Embeddings generation & database indexing

Live
Embedding generation: 156 vec/sec
DB indexing: 12 ms/img
Ingestion rate: 2,847 img/min
Storage sync: 99.9%
Python SDK

Powerful Data Querying

Query your datalake programmatically with the Python SDK. Filter by tags, custom metadata, filename, creation time, or asset type, with full type hints and auto-completion.

list_data() PARAMS
tags: List[str]
custom_metadata: Dict[str, Any]
limit: int
offset: int
order_by: str
TAG OPERATIONS
add_tags(): add to data
remove_tags(): remove from data
list_tags(): get all tags
create_tag(): create new tag
FILTERABLE
tags: DataTags
custom_metadata: custom fields
filename: asset name
created_at: timestamps
type: image/video
advanced_query.py
auto-complete • type hints
# Advanced data query
data = datalake.list_data(
  # Filter by tags
  tags=[
    "production",
    "validated"
  ],
  # Filter by custom_metadata
  custom_metadata={
    "location": "factory-A"
  },
  limit=1000
)

for item in data:
  print(item.filename)
EXECUTION
2,847 results
23 ms query time
847 MB scanned
MATCHED TAGS
production (1,892), validated (2,103), factory-A (1,245), factory-B (892)
Ready to create dataset
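
The filtering behaviour described in this section can be modelled in a few lines of plain Python. This is a local sketch of the semantics only and does not call the SDK: `Asset`, `filter_assets`, and the `match_all` flag are hypothetical names, and whether `list_data()` treats several tags as "all of" or "any of" is not stated on this page, so the sketch exposes both.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Asset:
    filename: str
    tags: List[str] = field(default_factory=list)
    custom_metadata: Dict[str, Any] = field(default_factory=dict)


def filter_assets(
    assets: List[Asset],
    tags: Optional[List[str]] = None,
    custom_metadata: Optional[Dict[str, Any]] = None,
    match_all: bool = True,  # assumption: True = every tag required, False = any tag
    limit: Optional[int] = None,
    offset: int = 0,
) -> List[Asset]:
    """Filter by tags and metadata, then apply offset/limit pagination."""
    selected = []
    for asset in assets:
        if tags:
            hits = [tag in asset.tags for tag in tags]
            if not (all(hits) if match_all else any(hits)):
                continue
        if custom_metadata:
            stored = asset.custom_metadata
            # Every requested key must be present with an equal value.
            if any(k not in stored or stored[k] != v for k, v in custom_metadata.items()):
                continue
        selected.append(asset)
    end = None if limit is None else offset + limit
    return selected[offset:end]
```

Paging through a large result set then amounts to calling the function repeatedly with a fixed `limit` and an `offset` that grows by `limit` each time, until fewer than `limit` items come back.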
Visual Search

Find similar images instantly

OpenCLIP embeddings turn your images into vectors. Search by similarity, cluster by content, and spot outliers without writing a single query.

Default model: ViT-B/16
Vector size: 512-dim
Vector DB: Qdrant
Search latency: <10 ms
Embeddings Viewer
UMAP • DBSCAN

Similarity Search

Image → Images
IMG_4521.jpg
cosine similarity > 0.85
847 matches
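
The threshold rule shown here (keep images whose cosine similarity to the query exceeds 0.85) can be sketched with a brute-force scan. The function names and the in-memory `index` dictionary are illustrative assumptions; a vector database avoids this linear scan by using an approximate nearest-neighbour index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_items(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    threshold: float = 0.85,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring above the threshold, best match first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    matches = [(name, score) for name, score in scored if score > threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)
```

In practice `query` and the values of `index` would be the 512-dimensional embeddings produced by the model, with the query vector coming from either an image or a text prompt.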

Text-to-Image Search

Text → Images
"damaged surface with rust"
CLIP text encoder • 156 results • 8 ms

Anomaly Detection

Isolation Forest (contamination: 0.01)
23 corrupted
89 outliers

Fine-tune Your Own CLIP Model

Generic embeddings not cutting it? Fine-tune a CLIP model on your own data. Search and clustering get much better when the model knows your domain.

+40% better accuracy
Organization

DataTags & Metadata Schema

Multi-dimensional organization with flexible tagging and comprehensive metadata support. Structure your data without moving files.

DATATAGS SYSTEM (organization tags)
AVAILABLE TAGS
factory-A (1,245)
factory-B (892)
production (1,892)
training (3,456)
edge-case (234)
validated (2,103)
Example asset: inspection_042.tiff, 4032x3024, 12.4 MB
Tags: factory-A, production, validated, Q1-2024
METADATA FIELDS (SDK v6.9.0+, auto-EXIF)
{
  // Location & Acquisition
  "latitude": 48.8566,
  "longitude": 2.3522,
  "altitude": 35.2,
  "acquired_at": "2024-03-15T14:32:00Z",
  "acquired_by": "drone-unit-7",
  "weather": "clear, 18C",

  // Camera & Sensor
  "focal_length": 24.0,
  "sensor_width": 36.0,
  "manufacturer": "DJI",
  "yaw": 127.5,
  "pitch": -45.0,
  "roll": 0.0,

  // Reference Fields
  "reference": "INS-2024-0042",
  "custom_id": "B-789"
}
Auto-extracted from EXIF with fill_metadata=True
EXIF • GPS
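
EXIF stores GPS coordinates as degrees, minutes, and seconds plus a hemisphere reference, while the metadata fields above use signed decimal degrees. How the platform converts between the two is not documented on this page; the sketch below shows the standard arithmetic, with `dms_to_decimal` as a hypothetical helper name.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds and a hemisphere letter to signed decimal degrees."""
    ref = ref.upper()
    if ref not in {"N", "S", "E", "W"}:
        raise ValueError("ref must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if ref in {"S", "W"} else value
```

For instance, 48° 51' 23.76" N and 2° 21' 7.92" E convert to approximately 48.8566 and 2.3522, the latitude and longitude in the example record.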

Ready to centralize your data?

Connect your storage, upload your data, and start querying. Free trial, no credit card.