Rights‑Cleared Human Behaviour Data for High‑Accuracy AI Models

Fresh, labeled, and fully consented datasets—delivered as Parquet, JSON, or streaming API—ready to boost your next LLM checkpoint or vertical model.

See Dataset Menu ↓
User Data (Consented & Secure) → Cookierem (Processing & Licensing) → AI Models (High Performance)

Why AI Teams Choose Cookierem

Provenance Guaranteed

Every row is traceable to an on‑chain cookie NFT—no copyright risk, no scraped junk.

Revocable by Design

If a user pulls consent, we ping your webhook so you can purge within hours. EU AI Act ready.

Fine‑Grained Labels

App sessions, GPS trails, MCC spend codes, heart‑rate curves: already anonymised & timestamped.

API or Bulk

Firehose stream for RLHF or nightly Parquet drops for batch training—your call.

Dataset Bundles

Pick a bundle or mix‑and‑match. All direct identifiers are stripped; a device hash is retained for sequence learning.

Chocolate‑Chip

App & web behaviour

  • Rows/day: 2.3B events
  • Format: Parquet + ISO timestamps
  • Use: RLHF sequences, recommenders
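Because a device hash is retained, events can be regrouped into per‑device sequences. A minimal sketch, assuming each event arrives as a dict with hypothetical `device_hash`, `ts` (ISO 8601) and `event` fields; the real column names are not specified on this page:

```python
from collections import defaultdict
from datetime import datetime


def build_sequences(events):
    """Group events by device hash and order each group by timestamp.

    The field names (device_hash, ts, event) are illustrative assumptions,
    not a documented schema.
    """
    by_device = defaultdict(list)
    for row in events:
        by_device[row["device_hash"]].append(row)

    sequences = {}
    for device, rows in by_device.items():
        # Parse ISO timestamps so ordering is chronological, not lexical.
        rows.sort(key=lambda r: datetime.fromisoformat(r["ts"]))
        sequences[device] = [r["event"] for r in rows]
    return sequences


sample_events = [
    {"device_hash": "a1", "ts": "2024-01-01T10:05:00+00:00", "event": "checkout"},
    {"device_hash": "b2", "ts": "2024-01-01T09:00:00+00:00", "event": "open_app"},
    {"device_hash": "a1", "ts": "2024-01-01T10:00:00+00:00", "event": "search"},
]
sequences = build_sequences(sample_events)
```

The resulting per‑device event lists could then be fed to a sequence model or recommender.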

Oatmeal‑Raisin

Location & mobility

  • Rows/day: 780M pings
  • Format: GeoJSON (10‑min tiles)
  • Use: smart‑city, EV routing ML
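One possible reading of "10‑min tiles" is that pings are bucketed into 10‑minute time windows. The sketch below counts pings per window under that assumption; the `ts` property name and the sample features are hypothetical, not a documented schema:

```python
from datetime import datetime, timezone

TILE_SECONDS = 600  # 10 minutes (assumed meaning of "10-min tiles")


def tile_start(ts_iso):
    """Floor an ISO 8601 timestamp to the start of its 10-minute window (UTC)."""
    epoch = int(datetime.fromisoformat(ts_iso).timestamp())
    floored = epoch - epoch % TILE_SECONDS
    return datetime.fromtimestamp(floored, tz=timezone.utc).isoformat()


def pings_per_tile(feature_collection):
    """Count GeoJSON features per 10-minute window."""
    counts = {}
    for feature in feature_collection["features"]:
        key = tile_start(feature["properties"]["ts"])  # "ts" is a hypothetical property
        counts[key] = counts.get(key, 0) + 1
    return counts


def _point(lon, lat, ts):
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"ts": ts},
    }


sample_pings = {
    "type": "FeatureCollection",
    "features": [
        _point(4.90, 52.37, "2024-01-01T10:03:00+00:00"),
        _point(4.91, 52.37, "2024-01-01T10:09:59+00:00"),
        _point(4.92, 52.38, "2024-01-01T10:10:00+00:00"),
    ],
}
tile_counts = pings_per_tile(sample_pings)
```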

Fortune Cookie

Anonymous spend graph

  • Rows/day: 95M txns
  • Format: CSV (MCC + amount)
  • Use: commerce GPT, fraud
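As a rough illustration of working with the MCC + amount CSV, the sketch below totals spend per merchant category code. The header names `mcc` and `amount` are assumptions; the actual column layout is not given here:

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def spend_by_mcc(csv_text):
    """Sum transaction amounts per MCC.

    Assumes a header row with 'mcc' and 'amount' columns (hypothetical names).
    Decimal avoids floating-point drift on currency values.
    """
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["mcc"]] += Decimal(row["amount"])
    return dict(totals)


sample_csv = "mcc,amount\n5411,23.10\n5812,8.50\n5411,4.90\n"
totals = spend_by_mcc(sample_csv)
```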

Gluten‑Free Protein

Biometrics & fitness

  • Rows/day: 12M users
  • Format: FHIR JSON / Arrow tensors
  • Use: vital sign predictors

Gingerbread

Environmental context

  • Sensors: Wi‑Fi, sound, light
  • Format: Sensor‑ML
  • Use: on‑device assistants

Custom‑Decorated

Zero‑party survey labels

  • Rows/week: 3M responses
  • Format: JSONL (Q&A)
  • Use: alignment tuning
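A JSONL (Q&A) file holds one JSON object per line, so it can be turned into prompt/response pairs with the standard library alone. This is a sketch that assumes hypothetical `question` and `answer` keys:

```python
import json


def load_qa_pairs(jsonl_text):
    """Parse JSONL text into (question, answer) tuples.

    The 'question' and 'answer' keys are assumed for illustration.
    Blank lines are skipped.
    """
    pairs = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        pairs.append((record["question"], record["answer"]))
    return pairs


sample_jsonl = (
    '{"question": "Do you prefer dark mode?", "answer": "yes"}\n'
    "\n"
    '{"question": "How often do you shop online?", "answer": "weekly"}\n'
)
qa_pairs = load_qa_pairs(sample_jsonl)
```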

Flexible Data Packages That Grow With You

Only pay for what you need with our customizable options. Scale up or down as your AI project evolves.

Starter

$0.05-$1 per 1k rows
Perfect for initial testing and proof of concepts
  • One-time bulk download access
  • 1-hour data granularity
  • Non-exclusive license
  • 1% sample dataset available free

Growth (Popular)

$1-$5 per 1k API calls
Ideal for teams scaling their AI models
  • Streaming API access for real-time training
  • 10-minute data granularity
  • 6-month exclusive embargo option
  • 5% data sampling with linear pricing

Enterprise

Custom pricing
For large-scale AI deployments and research
  • Full API + bulk download options
  • 1-minute data granularity available
  • Full exclusivity options available
  • Royalty share options for startups

Customize Your Data Package

Access Type

Download or API streaming

Granularity

Hourly to minute-by-minute

Exclusivity

Shared to fully exclusive

Sampling

1% to 100% of data

Royalty Options

For cash-light startups

All packages include our standard 10% marketplace fee. Premium and exclusive tiers carry a 15% fee.
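To get a feel for the numbers, the sketch below estimates a Starter cost range from the listed $0.05-$1 per 1k rows. It assumes the marketplace fee is a share of the listed price rather than an add‑on; the function names are illustrative and actual billing may differ:

```python
from decimal import Decimal

# Listed Starter range, in dollars per 1,000 rows.
STARTER_PRICE_PER_1K = (Decimal("0.05"), Decimal("1.00"))
STANDARD_FEE = Decimal("0.10")  # standard marketplace fee
PREMIUM_FEE = Decimal("0.15")   # premium and exclusive tiers


def starter_cost_range(rows):
    """Return the (low, high) cost in dollars for a one-time bulk download."""
    units = Decimal(rows) / Decimal(1000)
    low, high = STARTER_PRICE_PER_1K
    return units * low, units * high


def fee_share(price, fee_rate=STANDARD_FEE):
    """Portion of a price attributed to the marketplace fee.

    Assumption: the fee is included in the listed price, not added on top.
    """
    return price * fee_rate


low, high = starter_cost_range(2_000_000)  # 2M rows
fee_on_high = fee_share(high)
```

Under these assumptions, 2M rows would cost between $100 and $2,000, of which up to $200 would be the marketplace fee.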

Request Custom Quote

Built for Trust & Regulation

Regulation now requires provable, rights‑cleared training data

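EU AI Act, Article 10. Training, validation and testing datasets for any "high‑risk" AI system placed on the EU market must be "relevant, sufficiently representative, and to the best extent possible, free of errors and complete", and providers must document their data‑governance practices, including collection processes and data origin, so they can be audited. Separately, GDPR lets a data subject withdraw consent, so consent‑based data must be revocable.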

Artificial Intelligence Act EU

Similar language survives in the pre‑final consolidated text, confirming that providers will have to document provenance or face market‑surveillance penalties.

White & Case

How Cookierem Ensures Your Compliance

  • EU Data Act & "Fair Comp"

    Licence JSON contains purpose, duration, payment, portability URI.

  • GDPR / CCPA / CPRA

    Instant erase, portability, "Do Not Sell" toggles honoured on‑chain.

  • SOC 2 & ISO 27001

    Audited encryption, access controls, incident response.

  • Revocation API

    `POST /licence/{token}/revoke` + Webhook to purge your cache in <1h.
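The licence check and the purge step can be sketched as follows. Only the `POST /licence/{token}/revoke` path comes from this page; the licence key names, the webhook payload shape (`{"token": ...}`) and the `TrainingCache` class are hypothetical stand‑ins:

```python
# Key names are assumed; the page only says the licence JSON contains
# purpose, duration, payment and a portability URI.
REQUIRED_LICENCE_FIELDS = ("purpose", "duration", "payment", "portability_uri")


def missing_licence_fields(licence):
    """Return the required fields absent from a licence dict."""
    return [f for f in REQUIRED_LICENCE_FIELDS if f not in licence]


def revoke_path(token):
    """Build the revocation path for a licence token."""
    return f"/licence/{token}/revoke"


class TrainingCache:
    """Toy in-memory cache that keeps rows grouped by licence token."""

    def __init__(self):
        self._rows_by_token = {}

    def add(self, token, row):
        self._rows_by_token.setdefault(token, []).append(row)

    def purge(self, token):
        """Drop all rows for a token and return how many were removed."""
        return len(self._rows_by_token.pop(token, []))

    def __len__(self):
        return sum(len(rows) for rows in self._rows_by_token.values())


def handle_revocation_webhook(payload, cache):
    """Purge cached rows for the revoked licence (payload shape is assumed)."""
    return cache.purge(payload["token"])


cache = TrainingCache()
cache.add("tok-1", {"event": "search"})
cache.add("tok-1", {"event": "checkout"})
cache.add("tok-2", {"event": "open_app"})
purged = handle_revocation_webhook({"token": "tok-1"}, cache)
```

Keying cached rows by licence token keeps the purge a single lookup, which helps when a revocation has to be honoured quickly.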

Get Data in 3 Easy Steps

  1. Sign the NDA & get an API key

    Takes less than 10 minutes; e‑signed & approved automatically.

  2. Pick a Bundle

    Download a 1% sample slice or hit the firehose endpoint.

  3. Scale & Pay as You Go

    Metered billing via Stripe; autoscale to petabytes.

Ready to fuel your model with compliant human data?

Claim Your 5 GB Trial Dataset