
Bacalhau Project

Data. Transformed.

Simple, low cost, “Distributed First” tools that unlock an open, collaborative ecosystem.

Why Bacalhau?

Bacalhau seeks to address deep-rooted gaps in the research community. Existing data processing infrastructure is:

  • Expensive and slow, particularly with large datasets
  • Complicated, requiring significant rewriting of existing code to use
  • Difficult to share and collaborate with other organizations

Bacalhau transforms Big Data processing by giving developers simple, low-cost, decentralized tools that unlock a new collaborative ecosystem:

  • Fast and reliable - the network runs the jobs where the data is already stored
  • Simple and familiar - use the tools you already know and love (Docker, GNU, Python, R, MATLAB, WASM)
  • Collaborative - all content can be shared using the globally distributed IPFS network

How does it work?

Bacalhau is a network of open compute resources made available to serve any data processing workload.

Bacalhau enables users to run arbitrary Docker containers and WASM images against data stored in IPFS. Bacalhau is a peer-to-peer network of nodes where each node participates in executing (computing) jobs submitted to the cluster.

This architecture is referred to as Compute Over Data (or COD). Bacalhau is the Portuguese word for salted cod, which is the origin of the project's name.
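
The Compute Over Data idea, running a job on a node that already stores its input instead of moving the data, can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the `Node` class, `pick_node`, `run_job`, and the `cid-1`-style identifiers are all hypothetical names made up for this example, and it does not describe how the Bacalhau network actually selects nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical compute node that also stores some data locally."""
    name: str
    stored_cids: set = field(default_factory=set)


def pick_node(nodes, input_cid):
    """Return the first node that already holds input_cid, or None."""
    for node in nodes:
        if input_cid in node.stored_cids:
            return node
    return None


def run_job(nodes, input_cid, func):
    """Run func on the node that stores the input, so the data never moves."""
    node = pick_node(nodes, input_cid)
    if node is None:
        raise LookupError(f"no node stores {input_cid}")
    return node.name, func(input_cid)


# Made-up nodes and content identifiers, for illustration only.
nodes = [
    Node("node-a", {"cid-1", "cid-2"}),
    Node("node-b", {"cid-3"}),
]

# The job for "cid-3" is placed on node-b because that is where the data lives.
placement, result = run_job(nodes, "cid-3", lambda cid: f"processed {cid}")
print(placement, result)
```

In this toy version, placement depends only on where the input is stored; a real network would also have to weigh things like node capacity and availability.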

Submitting Jobs is Easy

$ bacalhau docker run ubuntu echo hello

$ bacalhau list

$ bacalhau get CID
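
For scripting, the same three commands could be assembled from Python. This is a minimal sketch that assumes only the invocations shown above; the helper name `bacalhau_commands` is hypothetical, `CID` is kept as the same placeholder used above, and nothing is executed unless `run_all` is called on a machine where the `bacalhau` binary is installed.

```python
import subprocess


def bacalhau_commands(image, command, result_id):
    """Build the three CLI invocations as argument lists (nothing is executed)."""
    return [
        ["bacalhau", "docker", "run", image, *command],
        ["bacalhau", "list"],
        ["bacalhau", "get", result_id],
    ]


def run_all(commands):
    """Run each command in order, raising on the first one that fails."""
    for argv in commands:
        subprocess.run(argv, check=True)


# "CID" is a placeholder, exactly as in the commands above.
commands = bacalhau_commands("ubuntu", ["echo", "hello"], "CID")

# Dry run: print what would be executed instead of calling the CLI.
for argv in commands:
    print(" ".join(argv))
```

Passing argument lists to `subprocess.run` avoids shell quoting issues when the container command contains spaces or special characters.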