| 1 | +### Master Slave Pattern. |
| 2 | +### How to detect and handle system failures and errors. |
| 3 | +### How to design a Distributed File System. |
| 4 | + |
| 5 | + |
| 6 | +### Scenario |
| 7 | +- Peer to peer |
| 8 | + - Master + Slave |
| 9 | + - DB |
| 10 | + - Master: store data |
| 11 | + - Slave: Backup |
| 12 | + - HDFS |
| 13 | + - manager |
| 14 | + - store data, partition |
| 15 | + - Interview Questions: |
| 16 | + - How does a disk save a file? |
| 17 | + -  |
| 18 | + - How to save a large file on one machine? |
| 19 | + - Example: ECS, 128 MB per chunk |
| 20 | + - Advantage: reduces the size of the metadata |
| 21 | + - Disadvantage: wastes space for small files |
| 22 | + - How to save an extra-large file? |
| 23 | + -  |
| 24 | + - Can each chunk's offset be kept somewhere other than the master? |
| 25 | + -  |
| 26 | + - How much space does the master need to store the metadata for 10 PB of files? |
| 27 | + 10 GB |
| 28 | + - Interviewer: How to write a file? |
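| 29 | + - One transfer vs. multiple transfers |
| 30 | + - If an error occurs during the write, the data has to be written again. Which approach is better? With a single transfer the whole file must be resent; with multiple transfers only one small piece must be resent. |
| 31 | + - If the file is written in multiple pieces, how large should each piece be? The file is already stored by chunk, so the chunk is also the unit of transfer. |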
| 32 | + -  |
| 33 | + - Client asks the master about the chunks (number of chunks) |
| 34 | + - Master returns the chunk index |
| 35 | + - Client finds the chunk on the chunk server |
| 36 | + - What if xxx.mp4 needs to be modified? |
| 37 | + - Write once, read many times. |
| 38 | + - First delete /gfs/home/dengchao.mp4 |
| 39 | + - Then write the whole file again from scratch |
| 40 | + - How to read a file? |
| 41 | + -  |
| 42 | + |
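The 10 PB to 10 GB estimate above can be checked with a little arithmetic. The sketch below assumes 64 bytes of metadata per 64 MB chunk; that ratio is an assumption for illustration and is not stated in these notes. The helper name `metadata_bytes` is hypothetical.

```python
PB = 2 ** 50
GB = 2 ** 30
MB = 2 ** 20


def metadata_bytes(total_data_bytes: int, chunk_bytes: int, meta_per_chunk: int) -> int:
    """Estimate master metadata size: one fixed-size record per chunk."""
    num_chunks = -(-total_data_bytes // chunk_bytes)  # ceiling division
    return num_chunks * meta_per_chunk


# Assumed figures: 64 MB chunks, 64 bytes of metadata per chunk.
estimate = metadata_bytes(10 * PB, 64 * MB, 64)
print(estimate // GB, "GB")
```

Under the same 64-byte assumption, doubling the chunk size to 128 MB halves the chunk count and therefore halves the metadata, which is the "reduces the size of the metadata" advantage listed above.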
| 43 | + |
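The argument for writing chunk by chunk (only the failed piece is resent) can be sketched as follows. This is a minimal illustration, not the actual client protocol: `send_chunk` is a hypothetical callback standing in for the network transfer to a chunk server, and the 4-byte chunk size is deliberately tiny so the example is easy to trace.

```python
from typing import Callable, List

CHUNK_SIZE = 4  # bytes; unrealistically small, for illustration only


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut the file into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def write_file(data: bytes, send_chunk: Callable[[int, bytes], bool],
               max_retries: int = 3) -> int:
    """Send chunks one by one and resend only a chunk that failed.

    Returns the total number of send attempts.
    """
    attempts = 0
    for index, chunk in enumerate(split_into_chunks(data)):
        for _ in range(max_retries):
            attempts += 1
            if send_chunk(index, chunk):
                break  # this chunk is done, move on to the next one
        else:
            raise IOError(f"chunk {index} failed after {max_retries} attempts")
    return attempts
```

If one chunk out of three fails once, the client makes four sends in total instead of resending the whole file.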
| 44 | +### Our work solution |
| 45 | + |
| 46 | + |
| 47 | +### Scale: Failure and Recovery |
| 48 | +- How to identify whether a chunk on the disk is broken? |
| 49 | + - Checksum |
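| 50 | + - When is the checksum written? |
| 51 | + - Together with the chunk, at the time the chunk is written |
| 52 | + - When is the checksum checked? |
| 53 | + - When this block of data is read |
| 54 | + - Re-read the data and compute its current checksum |
| 55 | + - Compare the current checksum with the one stored earlier |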
| 56 | + |
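The checksum steps above can be sketched as below. The notes do not say which checksum is used; CRC32 from Python's standard `zlib` module is assumed here. `ChunkStore` and its `corrupt` helper are hypothetical names that exist only for this illustration.

```python
import zlib
from typing import Dict, Tuple


class ChunkStore:
    """In-memory stand-in for a chunk server's disk."""

    def __init__(self) -> None:
        self._chunks: Dict[int, Tuple[bytes, int]] = {}

    def write(self, chunk_id: int, data: bytes) -> None:
        # The checksum is computed and stored together with the chunk.
        self._chunks[chunk_id] = (data, zlib.crc32(data))

    def read(self, chunk_id: int) -> bytes:
        data, stored_checksum = self._chunks[chunk_id]
        # Recompute on read and compare with the checksum saved at write time.
        if zlib.crc32(data) != stored_checksum:
            raise IOError(f"chunk {chunk_id} is corrupted")
        return data

    def corrupt(self, chunk_id: int, data: bytes) -> None:
        """Simulate disk damage: change the data but keep the old checksum."""
        _, stored_checksum = self._chunks[chunk_id]
        self._chunks[chunk_id] = (data, stored_checksum)
```

A read of an intact chunk returns the data; a read after `corrupt` raises `IOError`, which is the signal to start recovery.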
| 57 | +- How to avoid losing chunk data when a Chunk Server goes down or fails? |
| 58 | + - Replica |
| 59 | + |
| 60 | +- How to recover when a chunk is broken? |
| 61 | + - Ask master for help  |
| 62 | + |
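One way "ask the master for help" could work is sketched below. It assumes the master keeps a map from chunk id to the servers holding a replica, and that the broken server copies the data from another replica. The notes do not spell out this protocol, so `Master`, `other_replicas`, and `repair_chunk` are hypothetical names.

```python
from typing import Dict, List


class Master:
    """Hypothetical master that only tracks where replicas live."""

    def __init__(self) -> None:
        self.locations: Dict[str, List[str]] = {}  # chunk id -> server names

    def other_replicas(self, chunk_id: str, asking_server: str) -> List[str]:
        return [s for s in self.locations.get(chunk_id, []) if s != asking_server]


def repair_chunk(chunk_id: str, broken_server: str, master: Master,
                 disks: Dict[str, Dict[str, bytes]]) -> str:
    """Copy the chunk from another replica; return the server it came from.

    `disks` maps server name -> {chunk id -> data} and stands in for the network.
    A real system would also verify the replica's checksum before copying.
    """
    for server in master.other_replicas(chunk_id, broken_server):
        data = disks[server].get(chunk_id)
        if data is not None:
            disks[broken_server][chunk_id] = data
            return server
    raise LookupError(f"no healthy replica found for chunk {chunk_id}")
```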
| 63 | +- How to find whether a Chunk Server is down? |
| 64 | + - Answer: Heartbeat |
| 65 | + - chunk servers->master? |
| 66 | + |
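A heartbeat check can be sketched as below, assuming chunk servers report to the master and the master treats a server as down once no heartbeat has arrived within a timeout. `HeartbeatMonitor` and the timeout value are hypothetical; the `now` parameter exists only so the example can be traced without waiting on a real clock.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Master-side record of when each chunk server was last heard from."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout = timeout_seconds
        self.last_seen: Dict[str, float] = {}

    def record(self, server: str, now: Optional[float] = None) -> None:
        """Called whenever a heartbeat arrives from `server`."""
        self.last_seen[server] = time.monotonic() if now is None else now

    def dead_servers(self, now: Optional[float] = None) -> List[str]:
        """Servers whose last heartbeat is older than the timeout."""
        current = time.monotonic() if now is None else now
        return sorted(name for name, seen in self.last_seen.items()
                      if current - seen > self.timeout)
```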
| 67 | +- Scale: Write |
| 68 | + - [solution 1](./assets/chap1_35.png) |
| 69 | + - [Better solution](./assets/chap1_36.png) |
| 70 | + - How to choose the leader? |
| 71 | + - Pick the nearest one (fast) |
| 72 | + - Pick one that is currently idle (balances traffic) |
| 73 | + |
| 74 | + - How to solve Chunk Server failure? |
| 75 | + - Retry (reassign) |
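The two leader-selection rules and the retry-by-reassignment idea can be sketched together. The `distance` and `active_writes` fields are hypothetical measurements, and `choose_leader` is an illustrative helper rather than part of any real system.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable


@dataclass(frozen=True)
class ChunkServer:
    name: str
    distance: int       # assumed network distance from the client
    active_writes: int  # assumed current load


def choose_leader(servers: Iterable[ChunkServer], prefer: str = "nearest",
                  exclude: AbstractSet[str] = frozenset()) -> ChunkServer:
    """Pick the leader replica, skipping servers listed in `exclude`."""
    candidates = [s for s in servers if s.name not in exclude]
    if not candidates:
        raise LookupError("no chunk server available")
    if prefer == "nearest":   # fast: shortest distance, load breaks ties
        return min(candidates, key=lambda s: (s.distance, s.active_writes))
    if prefer == "idle":      # balances traffic: least busy, distance breaks ties
        return min(candidates, key=lambda s: (s.active_writes, s.distance))
    raise ValueError(f"unknown preference: {prefer}")
```

If the chosen server fails during the write, a retry adds its name to `exclude` and calls `choose_leader` again, which reassigns the write to one of the remaining servers.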