@@ -42,7 +42,7 @@ Yes we could have ran our sqoop jobs on EMR clusters but we wanted to run everyt
avoid an additional technology footprint. But even if we drop that restriction...
#### 2. `sqoop` does not support writing data directly to Delta Lake
- `scoop` can only import data as text or parquet. Writing to delta directly allows us to
+ `sqoop` can only import data as text or parquet. Writing to delta directly allows us to
optimize data storage for best performance on reads by just adding a couple of configuration options:
``` shell script
@@ -57,7 +57,7 @@ spark-submit /
```
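The body of the command above is elided in this diff, so it is not reproduced here. As a hedged sketch of the kind of configuration options being referred to, the standard settings that enable Delta Lake on a Spark 3.x `spark-submit` look roughly like this (the jar name and `delta-core` version are placeholders, not taken from the original post):

```shell
# Illustrative sketch only -- not the elided command from the post.
# import-job.jar and the delta-core version are placeholders; the two
# --conf settings are the standard ones that enable Delta Lake support.
spark-submit \
  --packages io.delta:delta-core_2.12:1.0.0 \
  --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog \
  import-job.jar
```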
#### 3. `--num-mappers` is just not good enough to control parallelism when working with a database
- `sqooop` uses map-reduce under the hood. We can specify the `--num-mappers` parameter that controls how many
+ `sqoop` uses map-reduce under the hood. We can specify the `--num-mappers` parameter that controls how many
mappers will be used to import data. A small number of mappers can result in a large volume
of data per import and long-running transactions. A large number of mappers will result in many connections
to the database, potentially overloading it, especially when there are a lot of `sqoop` jobs running in parallel.
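As a hedged illustration of the knob being discussed (the connection string, table, and split column below are hypothetical, not from the original post), a typical `sqoop import` invocation exposes parallelism like this:

```shell
# Hypothetical example of the --num-mappers knob discussed above.
# Each mapper opens its own JDBC connection, so 8 mappers means
# 8 concurrent connections to the source database for this one job.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --table orders \
  --split-by order_id \
  --num-mappers 8 \
  --as-parquetfile \
  --target-dir /data/raw/orders
```

The total connection count therefore scales with mappers × concurrently running jobs, which is why a single per-job knob is a blunt instrument.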