BIGS Tutorial
Parallelization
- Trivial scalability
- Hadoop bottleneck: shuffle phase -> heavy data exchange between nodes -> high network load
- Mahout: iterative processes on Hadoop
BIGS
- Iterativeness: a function may require a new iteration
- Java generics: type correctness is checked at compile time
- BIGS does not have a 'central' node
- Why NoSQL? Less expressiveness -> more scalability
- Keyed rows, with no fixed column set for each entry (row)
- Scan by key
- Access by key
- No joins
- No transactions; operations are reduced to check-and-put
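The storage model described in the list above (keyed rows without a fixed column set, scan by key, access by key, and check-and-put instead of transactions) can be sketched in a few lines of Java. This is only an in-memory illustration, not the BIGS API: the class `KeyedTable`, its methods, and the row and column names are all hypothetical. It uses a sorted map so that a scan over a key range is cheap, and a type parameter so that value types are checked at compile time.

```java
import java.util.Map;
import java.util.Objects;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a keyed NoSQL table (not the BIGS API).
final class KeyedTable<V> {
    // row key -> (column name -> value); rows are kept sorted by key.
    private final TreeMap<String, Map<String, V>> rows = new TreeMap<>();

    // Access by key: returns the stored value, or null if row or column is absent.
    synchronized V get(String rowKey, String column) {
        Map<String, V> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    // Unconditional write; each row may hold a different set of columns.
    synchronized void put(String rowKey, String column, V value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Check-and-put: write newValue only if the current value equals expected
    // (expected == null means "the cell must be absent"). Returns true on success.
    synchronized boolean checkAndPut(String rowKey, String column, V expected, V newValue) {
        if (!Objects.equals(get(rowKey, column), expected)) {
            return false;
        }
        put(rowKey, column, newValue);
        return true;
    }

    // Scan by key: shallow copy of the rows with startKey <= key < stopKey.
    synchronized SortedMap<String, Map<String, V>> scan(String startKey, String stopKey) {
        return new TreeMap<>(rows.subMap(startKey, stopKey));
    }
}

final class KeyedTableDemo {
    public static void main(String[] args) {
        KeyedTable<String> table = new KeyedTable<>();
        table.put("job-001", "status", "pending");
        table.put("job-002", "status", "pending");
        table.put("job-002", "owner", "worker-7"); // extra column on one row only

        // Two attempts to claim the same job: only the first one succeeds.
        boolean first = table.checkAndPut("job-001", "status", "pending", "running");
        boolean second = table.checkAndPut("job-001", "status", "pending", "running");
        System.out.println(first + " " + second);

        // Range scan over row keys, stop key exclusive.
        System.out.println(table.scan("job-000", "job-002").keySet());
    }
}
```

A conditional write of this kind is one way independent workers could claim a unit of work without a coordinating node; whether BIGS uses check-and-put in exactly this way is an assumption of the sketch. Note that there is no join and no multi-row transaction here: every operation touches a single row key or a contiguous key range.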