---
id: 5956795bc9e2c415eb244de1
title: Hash join
challengeType: 5
forumTopicId: 302284
dashedName: hash-join
---

# --description--

An [inner join](https://en.wikipedia.org/wiki/Join_(SQL)#Inner_join "wp: Join\_(SQL)#Inner_join") is an operation that combines two data tables into one table, based on matching column values. The simplest way of implementing this operation is the [nested loop join]( "wp: Nested loop join") algorithm, but a more scalable alternative is the [hash join]( "wp: hash join") algorithm.

The "hash join" algorithm consists of two steps:

  1. Hash phase: Create a multimap from one of the two tables, mapping from each join column value to all the rows that contain it.
    • The multimap must support hash-based lookup, which scales better than a simple linear search; that is the whole point of this algorithm.
    • Ideally we should create the multimap for the smaller table, thus minimizing its creation time and memory size.
  2. Join phase: Scan the other table, and find matching rows by looking in the multimap created before.
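
As a rough illustration of the two phases above, the multimap can be sketched in JavaScript with a `Map` whose values are arrays of rows. The helper name `buildMultimap` and the small sample table are assumptions made for this sketch, not part of the challenge.

```js
// Hash phase: group rows by the value found in their join column.
// `buildMultimap` is a hypothetical helper name used only in this sketch.
function buildMultimap(rows, key) {
  const multimap = new Map();
  for (const row of rows) {
    const k = row[key];
    const bucket = multimap.get(k);
    if (bucket) {
      bucket.push(row); // key already seen: append to its list of rows
    } else {
      multimap.set(k, [row]); // first row for this key
    }
  }
  return multimap;
}

// Sample rows (assumed for illustration).
const smaller = [
  { character: 'Jonah', nemesis: 'Whales' },
  { character: 'Jonah', nemesis: 'Spiders' },
  { character: 'Glory', nemesis: 'Buffy' }
];
const byCharacter = buildMultimap(smaller, 'character');

// Join phase lookup: a single hash lookup replaces a scan of the whole table.
const jonahRows = byCharacter.get('Jonah') ?? [];
const popeyeRows = byCharacter.get('Popeye') ?? [];
console.log(jonahRows.length);  // 2
console.log(popeyeRows.length); // 0
```

A `Map` is used here instead of a plain object so that join values such as `"constructor"` cannot collide with inherited properties.
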
In pseudo-code, the algorithm could be expressed as follows:
let A = the first input table (or ideally, the larger one)
let B = the second input table (or ideally, the smaller one)
let jA = the join column ID of table A
let jB = the join column ID of table B
let MB = a multimap for mapping from single values to multiple rows of table B (starts out empty)
let C = the output table (starts out empty)
for each row b in table B:
  place b in multimap MB under key b(jB)
for each row a in table A:
  for each row b in multimap MB under key a(jA):
    let c = the concatenation of row a and row b
    place row c in table C
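
The pseudo-code above translates fairly directly into JavaScript. The following is a minimal sketch, assuming rows are plain arrays and `jA`/`jB` are column indexes; the function name `hashJoinRows` and the sample tables are hypothetical, and the challenge itself works with objects rather than arrays.

```js
// Sketch of the pseudo-code, assuming each row is an array of column values.
// `hashJoinRows` is a hypothetical name used only for this illustration.
function hashJoinRows(A, B, jA, jB) {
  // Hash phase: multimap from join value to all rows of B that contain it.
  const MB = new Map();
  for (const b of B) {
    const key = b[jB];
    if (!MB.has(key)) MB.set(key, []);
    MB.get(key).push(b);
  }

  // Join phase: scan A and emit one concatenated row per match in MB.
  const C = [];
  for (const a of A) {
    for (const b of MB.get(a[jA]) ?? []) {
      C.push([...a, ...b]);
    }
  }
  return C;
}

// Sample tables (assumed): A joins on column 1, B joins on column 0.
const A = [[27, 'Jonah'], [18, 'Popeye'], [28, 'Glory']];
const B = [['Jonah', 'Whales'], ['Jonah', 'Spiders'], ['Glory', 'Buffy']];
const C = hashJoinRows(A, B, 1, 0);
console.log(C);
// [ [27, 'Jonah', 'Jonah', 'Whales'],
//   [27, 'Jonah', 'Jonah', 'Spiders'],
//   [28, 'Glory', 'Glory', 'Buffy'] ]
```

Rows of `A` with no entry in the multimap (here, `Popeye`) contribute nothing to the output, which is what makes this an inner join.
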
# --instructions--

Implement the "hash join" algorithm as a function and demonstrate that it passes the test-case listed below. The function should accept two arrays of objects and return an array of combined objects.

**Input**

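A =

| Age | Name   |
| --- | ------ |
| 27  | Jonah  |
| 18  | Alan   |
| 28  | Glory  |
| 18  | Popeye |
| 28  | Alan   |

B =

| Character | Nemesis |
| --------- | ------- |
| Jonah     | Whales  |
| Jonah     | Spiders |
| Alan      | Ghosts  |
| Alan      | Zombies |
| Glory     | Buffy   |

jA = Name (i.e. column 1)

jB = Character (i.e. column 0)
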
**Output**

| A_age | A_name | B_character | B_nemesis |
| ----- | ------ | ----------- | --------- |
| 27    | Jonah  | Jonah       | Whales    |
| 27    | Jonah  | Jonah       | Spiders   |
| 18    | Alan   | Alan        | Ghosts    |
| 18    | Alan   | Alan        | Zombies   |
| 28    | Glory  | Glory       | Buffy     |
| 28    | Alan   | Alan        | Ghosts    |
| 28    | Alan   | Alan        | Zombies   |

The order of the rows in the output table is not significant.

# --hints--

`hashJoin` should be a function.

```js
assert(typeof hashJoin === 'function');
```

`hashJoin([{ age: 27, name: "Jonah" }, { age: 18, name: "Alan" }, { age: 28, name: "Glory" }, { age: 18, name: "Popeye" }, { age: 28, name: "Alan" }], [{ character: "Jonah", nemesis: "Whales" }, { character: "Jonah", nemesis: "Spiders" }, { character: "Alan", nemesis: "Ghosts" }, { character: "Alan", nemesis: "Zombies" }, { character: "Glory", nemesis: "Buffy" }, { character: "Bob", nemesis: "foo" }])` should return `[{"A_age": 27, "A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Whales"}, {"A_age": 27, "A_name": "Jonah", "B_character": "Jonah", "B_nemesis": "Spiders"}, {"A_age": 18, "A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 18, "A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}, {"A_age": 28, "A_name": "Glory", "B_character": "Glory", "B_nemesis": "Buffy"}, {"A_age": 28, "A_name": "Alan", "B_character": "Alan", "B_nemesis": "Ghosts"}, {"A_age": 28, "A_name": "Alan", "B_character": "Alan", "B_nemesis": "Zombies"}]`

```js
assert.deepEqual(hashJoin(hash1, hash2), res);
```

# --seed--

## --after-user-code--

```js
const hash1 = [
  { age: 27, name: 'Jonah' },
  { age: 18, name: 'Alan' },
  { age: 28, name: 'Glory' },
  { age: 18, name: 'Popeye' },
  { age: 28, name: 'Alan' }
];

const hash2 = [
  { character: 'Jonah', nemesis: 'Whales' },
  { character: 'Jonah', nemesis: 'Spiders' },
  { character: 'Alan', nemesis: 'Ghosts' },
  { character: 'Alan', nemesis: 'Zombies' },
  { character: 'Glory', nemesis: 'Buffy' },
  { character: 'Bob', nemesis: 'foo' }
];

const res = [
  { A_age: 27, A_name: 'Jonah', B_character: 'Jonah', B_nemesis: 'Whales' },
  { A_age: 27, A_name: 'Jonah', B_character: 'Jonah', B_nemesis: 'Spiders' },
  { A_age: 18, A_name: 'Alan', B_character: 'Alan', B_nemesis: 'Ghosts' },
  { A_age: 18, A_name: 'Alan', B_character: 'Alan', B_nemesis: 'Zombies' },
  { A_age: 28, A_name: 'Glory', B_character: 'Glory', B_nemesis: 'Buffy' },
  { A_age: 28, A_name: 'Alan', B_character: 'Alan', B_nemesis: 'Ghosts' },
  { A_age: 28, A_name: 'Alan', B_character: 'Alan', B_nemesis: 'Zombies' }
];

const bench1 = [{ name: 'u2v7v', num: 1 }, { name: 'n53c8', num: 10 }, { name: 'oysce', num: 9 }, { name: '0mto2s', num: 1 }, { name: 'vkh5id', num: 4 }, { name: '5od0cf', num: 8 }, { name: 'uuulue', num: 10 }, { name: '3rgsbi', num: 9 }, { name: 'kccv35r', num: 4 }, { name: '80un74', num: 9 }, { name: 'h4pp3', num: 6 }, { name: '51bit', num: 7 }, { name: 'j9ndf', num: 8 }, { name: 'vf3u1', num: 10 }, { name: 'g0bw0om', num: 10 }, { name: 'j031x', num: 7 }, { name: 'ij3asc', num: 9 }, { name: 'byv83y', num: 8 }, { name: 'bjzp4k', num: 4 }, { name: 'f3kbnm', num: 10 }];

const bench2 = [{ friend: 'o8b', num: 8 }, { friend: 'ye', num: 2 }, { friend: '32i', num: 5 }, { friend: 'uz', num: 3 }, { friend: 'a5k', num: 4 }, { friend: 'uad', num: 7 }, { friend: '3w5', num: 10 }, { friend: 'vw', num: 10 }, { friend: 'ah', num: 4 }, { friend: 'qv', num: 7 }, { friend: 'ozv', num: 2 }, { friend: '9ri', num: 10 }, { friend: '7nu', num: 4 }, { friend: 'w3', num: 9 }, { friend: 'tgp', num: 8 }, { friend: 'ibs', num: 1 }, { friend: 'ss7', num: 6 }, { friend: 'g44', num: 9 }, { friend: 'tab', num: 9 }, { friend: 'zem', num: 10 }];
```

## --seed-contents--

```js
function hashJoin(hash1, hash2) {
  return [];
}
```

# --solutions--

```js
function hashJoin(hash1, hash2) {
  // Join tblA with tblB on the columns named in strJoin ("colA=colB").
  const hJoin = (tblA, tblB, strJoin) => {
    const [jA, jB] = strJoin.split('=');

    // Hash phase: group the rows of tblB by their join column value.
    const M = tblB.reduce((a, x) => {
      const id = x[jB];
      return (
        a[id] ? a[id].push(x) : (a[id] = [x]),
        a
      );
    }, {});

    // Join phase: look up each row of tblA and emit one combined row per match.
    return tblA.reduce((a, x) => {
      const match = M[x[jA]];
      return match ? (
        a.concat(match.map(row => dictConcat(x, row)))
      ) : a;
    }, []);
  };

  // Merge two rows into one object, prefixing keys with A_ and B_.
  // The comma operator always returns the accumulator, so falsy
  // column values (0, '', false) do not break the reduction.
  const dictConcat = (dctA, dctB) => {
    const ok = Object.keys;
    return ok(dctB).reduce(
      (a, k) => ((a[`B_${k}`] = dctB[k]), a),
      ok(dctA).reduce(
        (a, k) => ((a[`A_${k}`] = dctA[k]), a),
        {}
      )
    );
  };

  return hJoin(hash1, hash2, 'name=character');
}
```