Commit 813e4c5

Updated README
Added a few more resources to the README, and reformatted it to be easier to read.
1 parent 6cb8556 commit 813e4c5

1 file changed: +60 -75 lines changed

README.md

Lines changed: 60 additions & 75 deletions
@@ -23,11 +23,11 @@ This repository contains the files and lectures for the _[Insert Title or Organi
 * Need to demo how to create a twitter developer account to the twitter oauth token
 * Need to set the logging level to ERROR to reduce the output noise ([Video Example](https://youtu.be/vLrcjVxdTng0))
 * Code samples [one](https://drive.google.com/open?id=0Bym8DZ5hyGifdmRKMVR1QVlKNW8) and [two](https://drive.google.com/open?id=0Bym8DZ5hyGifX2t3UXNHc0RpWDA), don’t copy and paste these
-* We should point out winutils.exe needs to be installed for Windows users in order to run Spark applications.
+* We should point out [winutils.exe needs to be installed for Windows users in order to run Spark applications](https://docs.google.com/document/d/1bAsB0ZBjXGQ4md0Z3LIeaHosPQ1eFUmiIdU_FdanPqE/edit).
 
 ### Section 2: Spark Streaming Basics
 1. What are Discretized Streams
-* Reference: https://spark.apache.org/docs/latest/streaming-programming-guide.html#discretized-streams-dstreams
+* [Reference](https://spark.apache.org/docs/latest/streaming-programming-guide.html#discretized-streams-dstreams)
 * Use some graph to explain DStreams
 2. How to create Discretized Streams
 * Different ways to create DStreams
@@ -36,93 +36,79 @@ This repository contains the files and lectures for the _[Insert Title or Organi
 * Kafka
 * Flume
 * Kinesis
-* Twitter
-* Reference: https://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources
-* Revisit your first Spark application
-* DEMO: Queue of RDDs as a Stream
+* Twitter ([Reference](https://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources), revisit your first Spark application)
+* **DEMO:** Queue of RDDs as a Stream
 3. Transformations on DStreams
 * Basic RDD transformations(stateless transformation): `Map`, `flatMap`, `Filter`, `Repartition`, `Union`, `Count`, `Reduce`, `countByValue`, `reduceByKey`, `Join`, `Cogroup`
-* DEMO: Pick up 2 of the transformations to demo in the program
-* EXERCISE: prepare an exercise for student to use one of the transformations
+* **DEMO:** Pick up 2 of the transformations to demo in the program
+* **EXERCISE:** prepare an exercise for student to use one of the transformations
 4. Transform Operation
 * What is transform operation and the benefit of it ([Reference](https://spark.apache.org/docs/latest/streaming-programming-guide.html#transform-operation))
-* DEMO: do a demo with Transform Operation
-* EXERCISE: prepare an exercise for student to use transformation operation
+* **DEMO:** do a demo with Transform Operation
+* **EXERCISE:** prepare an exercise for student to use transformation operation
 5. Window Operations
 * What is Window Operations(better with some graphs)
 * Explain parameters (window length and sliding interval)
-* Some of the popular Window operations
-6. Window
-* countByWindow
-* reduceByKeyAndWindow
-* countByValueAndWindow
-* Window
-* Explain Window transformation in depth and what is the usage of Window function
-* DEMO: Do a demo with Window transformation
-* EXERCISE: Give an exercise about Window tansformation
-7. countByWindow
-* Explain countByWindow transformation in depth and what is the usage of countByWindow function
-* DEMO: Do a demo with countByWindow transformation
-* EXERCISE: Give an exercise about countByWindow tansformation
-8. reduceByKeyAndWindow
-* Explain reduceByKeyAndWindow transformation in depth and what is the usage of reduceByKeyAndWindow function
-* DEMO: Do a demo with reduceByKeyAndWindow transformation
-* EXERCISE: Give an exercise about reduceByKeyAndWindow tansformation
-9. countByValueAndWindow
-* Explain countByValueAndWindow transformation in depth and what is the usage of countByValueAndWindow function
-* DEMO: Do a demo with countByValueAndWindow transformation
-* EXERCISE: Give an exercise about countByValueAndWindow tansformation
+* Some of the popular Window operations (e.g., `Window`, `countByWindow`, `reduceByKeyAndWindow`, `countByValueAndWindow`)
+6. `Window`
+* Explain `Window` transformation in depth and what is the usage of `Window` function
+* **DEMO:** Do a demo with `Window` transformation
+* **EXERCISE:** Give an exercise about `Window` tansformation
+7. `countByWindow`
+* Explain `countByWindow` transformation in depth and what is the usage of `countByWindow` function
+* **DEMO:** Do a demo with `countByWindow` transformation
+* **EXERCISE:** Give an exercise about `countByWindow` tansformation
+8. `reduceByKeyAndWindow`
+* Explain `reduceByKeyAndWindow` transformation in depth and what is the usage of `reduceByKeyAndWindow` function
+* **DEMO:** Do a demo with `reduceByKeyAndWindow` transformation
+* **EXERCISE:** Give an exercise about `reduceByKeyAndWindow` tansformation
+9. `countByValueAndWindow`
+* Explain `countByValueAndWindow` transformation in depth and what is the usage of `countByValueAndWindow` function
+* **DEMO:** Do a demo with `countByValueAndWindow` transformation
+* **EXERCISE:** Give an exercise about `countByValueAndWindow` tansformation
 10. Output Operations on DStreams
-* Different output operation
-* Print
-* saveAsTextFiles
-* saveAsObjectFiles
-* saveAsHadoopFiles
-* foreachRDD
-* DEMO: Demo how to save tweets to files
-* Example: https://drive.google.com/open?id=0Bym8DZ5hyGifaXgwWFQxdVQ4UzA
-* use foreachRDD and saveAsTextFiles
-11. foreachRDD
-* Explain foreachRDD and the basic usage about foreachRDD
-* Design Patterns for foreachRDD
+* Different output operation (e.g., `Print`, `saveAsTextFiles`, `saveAsObjectFiles`, `saveAsHadoopFiles`, `foreachRDD`)
+* **DEMO:** Demo how to save tweets to files ([Example](https://drive.google.com/open?id=0Bym8DZ5hyGifaXgwWFQxdVQ4UzA))
+* use `foreachRDD` and `saveAsTextFiles`
+11. `foreachRDD`
+* Explain `foreachRDD` and the basic usage about `foreachRDD`
+* Design Patterns for `foreachRDD`
 * Reference: https://spark.apache.org/docs/latest/streaming-programming-guide.html#design-patterns-for-using-foreachrdd
-* DEMO: Do a demo with foreachRDD
-* EXERCISE: Give an exercise about foreachRDD
+* **DEMO:** Do a demo with `foreachRDD`
+* **EXERCISE:** Give an exercise about `foreachRDD`
 12. SQL OPERATIONS
-* https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations
-* DEMO: Do a demo with SQL OPERATIONS
-* EXERCISE: Give an exercise about SQL OPERATIONS
+* [Dataframe and SQL Operations](https://spark.apache.org/docs/latest/streaming-programming-guide.html#dataframe-and-sql-operations)
+* **DEMO:** Do a demo with SQL OPERATIONS
+* **EXERCISE:** Give an exercise about SQL OPERATIONS
 
 ### 3. Section: Advanced
 1. Join Operations
 * Different types of Join
 * Stream-stream joins
 * Stream-dataset joins
-* DEMO: Do a demo with Stream-stream joins
-* DEMO: Do a demo with Stream-dataset joins
-* EXERCISE: Give an exercise with Stream-stream joins or Stream-dataset joins
+* **DEMO:** Do a demo with Stream-stream joins
+* **DEMO:** Do a demo with Stream-dataset joins
+* **EXERCISE:** Give an exercise with Stream-stream joins or Stream-dataset joins
 2. Stateful transformation
 * Transformations
-* UpdateStateByKey
-* mapWithState
-* DEMO Do a demo with UpdateStateByKey or mapWithState
-* Needs come up with a proper scenario to use mapWithState or UpdateStateByKey, such as some web session data.
-* EXERCISE: Prepare an exercise with UpdateStateByKey or mapWithState
+* `UpdateStateByKey`
+* `mapWithState`
+* **DEMO** Do a demo with `UpdateStateByKey` or `mapWithState`
+* Needs come up with a proper scenario to use `mapWithState` or `UpdateStateByKey`, such as some [web session data](https://drive.google.com/file/d/0Bym8DZ5hyGifWTJkQW5laUdwRU0/view).
+* **EXERCISE:** Prepare an exercise with UpdateStateByKey or mapWithState
 3. Check point
 * What is checkpoint and why use check point
-* Different types of checkpoint
-* Metadata checkpointing
-* Data checkpointing
+* Different types of checkpoint (Metadata checkpointing & Data checkpointing)
 * When to enable Checkpointing
 * How to configure Checkpointing
-* DEMO: Do a demo with Checkpointing
-* EXERCISE: Give Exercise with Checkpointing
+* **DEMO:** Do a demo with Checkpointing
+* **EXERCISE:** Give Exercise with Checkpointing
 4. Accumulators
 * What is Accumulators and usage of Accumulators
-* DEMO: Do a demo with Accumulators
-* EXERCISE: Give an Exercise with Accumulators
+* **DEMO:** Do a demo with Accumulators
+* **EXERCISE:** Give an Exercise with Accumulators
 5. Fault-tolerance
-* https://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics
+* [Fault Tolerance Semantics](https://spark.apache.org/docs/latest/streaming-programming-guide.html#fault-tolerance-semantics)
 
 ### Section 4: More about Spark streaming
 1. Performance Tuning
@@ -137,33 +123,32 @@ This repository contains the files and lectures for the _[Insert Title or Organi
 2. Integration with Kafka
 * Introduction to Kafka
 * Why integrate with Kafka
-* DEMO: Demo
+* DEMO: [Demo](https://drive.google.com/file/d/0Bym8DZ5hyGifcnU1ZVVteEI3X1U/view?usp=drive_web)
 3. Integration with Kinesis
 * Introduction to Kinesis
 * Why integrate with Kinesis
-* DEMO: Demo
+* **DEMO:** [Demo](https://drive.google.com/file/d/0Bym8DZ5hyGifX2JNdFZENUpiRXM/view)
 
 ### Section 5: Structured Streaming
 1. Introduction about Structured Streaming
 * Overview of Structured Streaming
-* The Benefit of structured streaming
+* [The Benefit of structured streaming](https://drive.google.com/file/d/0Bym8DZ5hyGifM2VOYlJVQ3NwaTg/view)
 * [Basic Concepts about Spark streaming](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#basic-concepts)
-* [DEMO: A quick demo about an structured streaming example](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#quick-example)
-
+* **DEMO:** [A quick demo about an structured streaming example](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#quick-example)
 2. Operations on streaming DataFrames/Datasets
 * [Structured Streaming Programming Guide: Operations on Streaming Dataframe Datasets](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#operations-on-streaming-dataframesdatasets)
-* DEMO: DO a demo:
-* EXERCISE: Prepare an excise
+* **DEMO:** DO a demo:
+* **EXERCISE:** Prepare an excise
 3. Window Operations
 * [Structured Streaming Programming Guide: Window Operations on Event Time](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#window-operations-on-event-time)
-* DEMO: Do a demo ([exmaple](https://drive.google.com/open?id=0Bym8DZ5hyGifU2YzUmx3aldVdkU))
-* EXERCISE: Prepare an excise
+* **DEMO:** Do a demo ([exmaple](https://drive.google.com/open?id=0Bym8DZ5hyGifU2YzUmx3aldVdkU))
+* **EXERCISE:** Prepare an excise
 4. Handling Late Data and Watermarking
-* https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#handling-late-data-and-watermarking
+* [Handling Late Data and Watermarking Example](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#handling-late-data-and-watermarking)
 
 ### Section 6: Finish up
 1. Add an introductory lecture about that is covered in the course
-* this video should be placed as the first lecture of this course, but we do it after we are done creating this course
+* This video should be placed as the first lecture of this course, but we do it after we are done creating this course
 2. Add a promotion video
 * This will be about what users will learn from this lecture and how they will benefit
 3. Finish up lecture
