diff --git a/_config.yml b/_config.yml
index d4916414195c9..9f9298f22dd72 100644
--- a/_config.yml
+++ b/_config.yml
@@ -3,14 +3,15 @@
#
# Name of your site (displayed in the header)
-name: Your Name
+name: johnsonwu
# Short bio or description (displayed in the header)
-description: Web Developer from Somewhere
+description: data mining & machine learning
# URL of your avatar or profile pic (you could use your GitHub profile pic)
avatar: https://raw.githubusercontent.com/barryclark/jekyll-now/master/images/jekyll-logo.png
+markdown: redcarpet
#
# Flags below are optional
#
diff --git a/_layouts/default.html b/_layouts/default.html
index b2939c0bc4483..ca5716827d267 100644
--- a/_layouts/default.html
+++ b/_layouts/default.html
@@ -11,7 +11,7 @@
-
+
diff --git a/_posts/2016-07-17-Spark-Logistic-Regression.md b/_posts/2016-07-17-Spark-Logistic-Regression.md
new file mode 100644
index 0000000000000..45482dd394bf4
--- /dev/null
+++ b/_posts/2016-07-17-Spark-Logistic-Regression.md
@@ -0,0 +1,31 @@
+---
+layout: post
+title: Logistic Regression
+---
+
+Logistic Regression
+==
+Principles
+--
+
+```java
+public class HelloWorld {
+    public static void main(String[] args) {
+        System.out.println("Hello, world");
+    }
+}
+```
+
+```c++
+#include <iostream>
+using namespace std;
+
+int main(int argc, char const *argv[]) {
+    cout << "Hello, World" << endl;
+    return 0;
+}
+```
+
+$$
+\Gamma(z) = \int_0^\infty t^{z-1}e^{-t}dt\,.
+$$
diff --git a/_posts/2016-07-23-spark-notes.md b/_posts/2016-07-23-spark-notes.md
new file mode 100644
index 0000000000000..4518d79c1f017
--- /dev/null
+++ b/_posts/2016-07-23-spark-notes.md
@@ -0,0 +1,290 @@
+## 1. Protocol Buffers
+1. provides a compiler and a set of libraries that a developer can use to serialize data
+2. the developer defines the structure or schema of a dataset in a file and compiles it with the Protocol Buffers compiler, which generates the code that can then be used to easily read or write that data
+3. supported languages: C++, Java, and Python. Although it is primarily a data serialization format, it can also be used for defining remote services
+
+
+## 2. SequenceFile
+1. SequenceFile is a binary flat file format for storing key-value pairs.
+2. there are 3 different formats: uncompressed, record compressed, and block compressed. In a record-compressed SequenceFile, only the value in a record is compressed, whereas in a block-compressed SequenceFile, both keys and values are compressed (see the sketch below).
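+
+A minimal Spark sketch (assuming a SparkContext `sc` is available and the output path, which is made up here, is writable) of writing an RDD of key-value pairs as a SequenceFile and reading it back:
+
+```scala
+// write an RDD of (key, value) pairs as a SequenceFile
+val pairs = sc.parallelize(List(("a", 1), ("b", 2), ("c", 3)))
+pairs.saveAsSequenceFile("/tmp/pairs-seqfile") // the path is an assumption
+
+// read it back; the key and value types are given explicitly
+val readBack = sc.sequenceFile[String, Int]("/tmp/pairs-seqfile")
+readBack.collect()
+```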
+
+
+## 3. columnar storage
+1. row-oriented storage is ideal for applications that mostly perform CRUD (create, read, update, delete) operations on data.
+2. row-oriented storage is not efficient for analytics applications
+3. row-oriented storage cannot be efficiently compressed.
+
+## 4. RCFile
+1. RCFile first splits a table into row groups, and then stores each row group in column format.
+
+## 5. functional programming
+1. functional programming provides a tremendous boost in developer productivity.
+2. functional programming makes it easier to write concurrent or multithreaded applications
+3. functional programming helps to write robust code.
+4. functional programming helps to write elegant code.
+
+
+## 6. scala language
+#### 6.1 Variables
+1. mutable variables are declared using the keyword var
+2. immutable variables are declared using the keyword val (see the sketch below)
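+
+A short sketch of the difference:
+
+```scala
+var count = 0   // mutable: can be reassigned
+count = count + 1
+
+val limit = 10  // immutable: reassigning it would be a compile error
+// limit = 20   // does not compile
+```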
+
+#### 6.2 functions
+1. a function is defined using the keyword def
+2. scala function examples:
+
+```scala
+def add(firstInput: Int, secondInput: Int): Int = {
+  val sum = firstInput + secondInput
+  return sum
+}
+
+//example 2: the return type and the return keyword can be omitted
+def add(firstInput: Int, secondInput: Int) = firstInput + secondInput
+```
+
+#### 6.3 methods
+1. a method is a function that is a member of an object. It is defined like, and works the same as, a function. The only difference is that a method has access to all the fields of the object to which it belongs (see the sketch below).
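+
+A minimal sketch (the class and its field are made up for illustration):
+
+```scala
+class Counter(start: Int) {
+  private var current = start
+
+  // a method: it can read and update the field `current`
+  def increment(step: Int): Int = {
+    current = current + step
+    current
+  }
+}
+
+val c = new Counter(10)
+c.increment(5) // returns 15
+```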
+
+#### 6.4 local functions
+1. a function defined inside another function or method is called a local function (see the sketch below).
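+
+A small sketch (the names are hypothetical):
+
+```scala
+def totalWithTax(prices: List[Double]): Double = {
+  // local function: visible only inside totalWithTax
+  def withTax(p: Double): Double = p * 1.08
+  prices.map(withTax).sum
+}
+
+totalWithTax(List(10.0, 20.0)) // approximately 32.4
+```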
+
+#### 6.5 higher-order methods
+1. a method that takes a function as an input parameter is called a higher-order method.
+
+```scala
+//example
+def encode(n: Int, f: (Int) => Long): Long = {
+  val x = n * 10
+  f(x)
+}
+```
+
+#### 6.6 function literals
+1. a function literal is an unnamed or anonymous function in source code. It can be used in an application just like a string literal. It can be passed as an input to a higher-order method or function. It can also be assigned to a variable.
+
+2. a function literal is defined with input parameters in parentheses, followed by a right arrow and the body of the function.
+```scala
+//example code
+(x: Int) => {
+  x + 100
+}
+
+//example code
+val code = encode(10, (x: Int) => x + 100)
+```
+#### 6.7 closures
+1. the body of a function literal typically uses only input parameters and local variables defined within the function literal. A closure is a function literal whose body also references variables defined outside of it, such as `seed` in the example below.
+
+```scala
+// for example, encodedWithSeed(3, 4): first y = 3 + 1000 is computed,
+// then the closure returns 1003 * 4
+def encodedWithSeed(num: Int, seed: Int): Long = {
+  def encode(x: Int, func: (Int) => Long): Long = {
+    val y = x + 1000
+    func(y)
+  }
+  // the function literal below captures `seed` from the enclosing scope,
+  // which makes it a closure
+  val result = encode(num, (n: Int) => (n * seed))
+  result
+}
+```
+
+#### 6.8 classes
+1. consists of fields and methods.
+2. a class is a template or blueprint for creating objects at runtime.
+3. an object is an instance of a class.
+4. a class is defined in source code, whereas an object exists at runtime.
+
+```scala
+//example code
+class Car(mk: String, ml: String, cr: String) {
+  val make = mk
+  val model = ml
+  var color = cr // var, so repaint can reassign it
+
+  def repaint(newColor: String) = {
+    color = newColor
+  }
+}
+```
+
+#### 6.9 singletons
+1. a class that can be instantiated only once is called a singleton.
+2. scala provides the keyword object for defining a singleton class.
+
+```scala
+//example code
+object DatabaseConnection {
+  // stub bodies: ??? throws NotImplementedError; the real I/O is omitted here
+  def open(name: String): Int = ???
+
+  def read(streamId: Int): Array[Byte] = ???
+
+  def close(): Unit = ???
+}
+```
+
+
+#### 6.10 case class
+1. the compiler creates a factory method with the same name.
+2. an instance of a case class can be written without using new.
+3. all input parameters specified in the definition of a case class implicitly get a val prefix (see the sketch below).
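+
+A small sketch (the Message case class is made up for illustration):
+
+```scala
+case class Message(sender: String, text: String)
+
+// no `new` needed; the generated factory method is used instead
+val m = Message("alice", "hello")
+
+// constructor parameters are implicitly vals, so they are readable fields
+println(m.sender) // alice
+
+// case classes also work directly with pattern matching
+m match {
+  case Message(who, _) => println("from " + who)
+}
+```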
+
+#### 6.11 pattern match
+1. replaces multi-level if-else statements.
+
+```scala
+def colorToNumber(color: String): Int = {
+  val num = color match {
+    case "Red" => 1
+    case "Blue" => 2
+    case "Green" => 3
+    case "Yellow" => 4
+    case _ => 0
+  }
+  num
+}
+```
+
+#### 6.12 traits
+1. a trait represents an interface supported by a hierarchy of related classes.
+2. a trait looks similar to an abstract class. Both can contain fields and methods.
+3. the key difference is that a class can inherit from only one class, but it can inherit from any number of traits.
+
+```scala
+//example code
+trait Shape {
+  def area: Int
+}
+
+class Square(length: Int) extends Shape {
+  def area: Int = length * length
+}
+
+class Rectangle(length: Int, width: Int) extends Shape {
+  def area: Int = length * width
+}
+```
+
+#### 6.13 tuples
+1. a tuple is a container for storing two or more elements of different types. It is immutable; it cannot be modified after it has been created.
+2. it is useful in situations where you want to group unrelated elements (element access is sketched after the example below).
+
+```scala
+val twoElements = ("10", true)
+```
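+
+Elements of a tuple are accessed by position, as sketched below:
+
+```scala
+val pair = ("10", true)
+pair._1 // "10": String
+pair._2 // true: Boolean
+```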
+
+
+#### 6.14 option type
+1. an Option is a data type that indicates the presence or absence of some data.
+2. the Option data type is used with a function or a method that optionally returns a value.
+3. it returns either Some(x), where x is the actual returned value, or the None object
+
+```scala
+//example code
+def colorCode(color: String): Option[Int] = {
+ color match {
+ case "red" => Some(1)
+ case "blue" => Some(2)
+ case "black" => Some(3)
+ case _ => None
+ }
+}
+val code = colorCode("orange")
+code match {
+ case Some(c) => println("code for orange is: " + c)
+ case None => println("code not defined for orange")
+}
+```
+
+#### 6.15 List
+1. List is a linear sequence of elements of the same type.
+2. Although an element in a list can be accessed by its index, it is not an efficient data structure for accessing elements by their indices.
+
+```scala
+//example code
+val xs = List(10,20,30,40)
+```
+
+#### 6.16 Higher-order methods on collection classes
+##### 1. map
+
+```scala
+//example code
+val xs = List(1,2,3,4)
+val ys = xs.map((x: Int) => x * 10)
+```
+
+##### 2. flatMap
+1. flatMap is similar to map
+2. it takes a function as input, applies it to each element in a collection, and returns another collection as a result. The difference is that the input function returns a collection for each element, and flatMap flattens those per-element collections into a single collection (see the sketch below).
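+
+A small sketch of the difference between map and flatMap:
+
+```scala
+val lines = List("hello world", "spark notes")
+
+// map: one output element per input element, so the result is nested
+val nested = lines.map(line => line.split(" ").toList)
+// List(List(hello, world), List(spark, notes))
+
+// flatMap: the per-element collections are flattened into a single list
+val words = lines.flatMap(line => line.split(" "))
+// List(hello, world, spark, notes)
+```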
+
+##### 3. reduce
+1. the reduce method returns a single value. as the name implies, it reduces a collection to a single value.
+2. the input function to reduce takes two inputs at a time and returns one value.
+3. the input function is a binary operator that must be both associative and commutative.
+
+```scala
+//example code.
+
+val xsForReduce = List(2,4,6,8,10)
+val sum = xsForReduce.reduce((x, y) => x + y)
+val product = xsForReduce.reduce((x, y) => x * y)
+val max = xsForReduce.reduce((x,y) => if (x > y) x else y)
+val min = xsForReduce.reduce((x,y) => if (x < y) x else y)
+```
+## 7. spark
+#### 7.1 spark is fast
+1. spark allows in-memory cluster computing.
+2. spark has an advanced job execution engine.
+#### 7.2 workers
+1. a worker provides CPU, memory, and storage resources to a Spark application.
+2. the workers run a Spark application as distributed processes on a cluster of nodes.
+
+#### 7.3 cluster managers
+1. manages computing resources across a cluster of worker nodes.
+2. provides low-level scheduling of cluster resources across applications.
+3. enables multiple applications to share cluster resources and run on the same worker nodes.
+4. Spark currently supports 3 cluster managers: standalone, Mesos, and YARN. Mesos and YARN allow Spark and Hadoop applications to run on the same worker nodes.
+
+#### 7.4 executors
+1. an executor is a JVM process that spark creates on each worker for an application.
+2. it executes application code concurrently in multiple threads.
+
+#### 7.5 tasks
+1. a task is the smallest unit of work that Spark sends to an executor.
+2. it is executed by a thread in an executor on a worker node.
+3. each task either returns a result to the driver program or partitions its output for a shuffle.
+4. Spark creates one task for each data partition.
+
+#### 7.6 application execution
+1. shuffle. A shuffle redistributes data among a cluster of nodes. It is expensive because it involves moving data across a network. Note: a shuffle does not randomly redistribute data; it groups data elements into buckets based on some criteria.
+
+2. Job. A job is a set of computations that spark performs to return results to a driver program.
+
+3. Stage. A stage is a collection of tasks. Spark splits a job into a DAG (directed acyclic graph) of stages.
+
+## 8. RDD operations
+#### 8.1 zip
+1. takes an RDD as input and returns an RDD of pairs, where the first element in a pair is from the source RDD and the second element is from the input RDD (see the sketch below).
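+
+A short sketch (assuming a SparkContext `sc`); zip requires both RDDs to have the same number of partitions and the same number of elements per partition, which holds here because the second RDD is derived from the first with map:
+
+```scala
+val numbers = sc.parallelize(List(1, 2, 3, 4))
+val squares = numbers.map(n => n * n)
+
+val zipped = numbers.zip(squares)
+zipped.collect() // Array((1,1), (2,4), (3,9), (4,16))
+```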
+
+#### 8.2 pipe
+1. allows executing an external program in a forked process (see the sketch below)
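+
+A hedged sketch (assuming a SparkContext `sc` and that the external command, here `tr`, exists on every worker node):
+
+```scala
+val words = sc.parallelize(List("spark", "hadoop", "hive"))
+
+// each partition's elements are written to the command's stdin, and the
+// command's stdout lines become the elements of the result RDD
+val upper = words.pipe("tr a-z A-Z")
+upper.collect() // Array(SPARK, HADOOP, HIVE)
+```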
+
+#### 8.3 coalesce
+1. reduce the number of partitions of an RDD
+
+```scala
+//example code
+val numbers = sc.parallelize((1 to 100).toList)
+val numbersWithOneRDD = numbers.coalesce(1)
+```
+
+#### 8.4 repartition
diff --git a/_sass/_highlights.scss b/_sass/_highlights.scss
index 57c7b72f07617..3e3d536fa6c9e 100644
--- a/_sass/_highlights.scss
+++ b/_sass/_highlights.scss
@@ -1,6 +1,6 @@
.highlight {
- background-color: #efefef;
+ background-color: #292525;
padding: 7px 7px 7px 10px;
border: 1px solid #ddd;
-moz-box-shadow: 3px 3px rgba(0,0,0,0.1);
@@ -81,4 +81,4 @@ code {
.highlight .vc { color: #268BD2 } /* Name.Variable.Class */
.highlight .vg { color: #268BD2 } /* Name.Variable.Global */
.highlight .vi { color: #268BD2 } /* Name.Variable.Instance */
-.highlight .il { color: #2AA198 } /* Literal.Number.Integer.Long */
\ No newline at end of file
+.highlight .il { color: #2AA198 } /* Literal.Number.Integer.Long */