Jun
07

Useful Python libraries

contextlib: lets you use an object in a “with” statement, so its resources are released as soon as the block ends and memory use stays small.

py4j: call Java from Python
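
As a rough illustration of the contextlib idea, here is a minimal sketch. The helper name `managed_buffer` is hypothetical and only for this example; it wraps an in-memory buffer so that it is always closed when the “with” block ends.

```python
import io
from contextlib import contextmanager


@contextmanager
def managed_buffer(initial_text):
    """Hypothetical helper: yield an in-memory buffer and always close it."""
    buffer = io.StringIO(initial_text)
    try:
        yield buffer
    finally:
        # Runs even if the body of the "with" block raises,
        # so the buffer's memory is released promptly.
        buffer.close()


with managed_buffer("hello") as buf:
    first_read = buf.read()

# After the block, the buffer has been closed for us.
buffer_closed = buf.closed
```

The same pattern applies to files, database connections, or any object that should not outlive the block that uses it.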

May
02

Python vs. R

It is very difficult to say which one I like more.

Jupyter vs. RMarkdown

  • RMarkdown: quick to edit and produces beautiful reports, but compiling is slow
  • Jupyter: a console in the browser; flexible, but notebooks can easily become chaotic
  • Both can integrate different languages

Speed:

  • Python > R

Community:

  • R has more users outside the computer science community

Time Series:

  • R has very good tutorials and packages for time series.

Image processing:

  • Since Python is faster than R, Python is more suitable for image processing

Dataframe:

  • R: dplyr
  • Python: pandas

Conclusion:

Master both languages and you will find many friends. 🙂

May
02

Install Jupyter Kernel

1. How to install the IPython 3 kernel in Jupyter?
+ Install Anaconda 3
+ Copy the “Jupyter Notebook” shortcut to the desktop
+ Right-click -> “Properties” -> change “Start in” to the folder you want to start in

2. How to install IRkernel in Jupyter?
+ Open the Anaconda 3 prompt
+ conda install -c r r-essentials
+ See https://www.continuum.io/blog/developer/jupyter-and-conda-r

May
02

compare two lists

It is very easy to compare two lists element by element in Python, but not many people seem to know how.

Just use the `==` operator on NumPy arrays. It returns a NumPy array of boolean values, where True means the elements at that position are identical.

import numpy as np

aa = ["a", "b", "a"]
bb = ["b", "b", "a"]

np.array(aa) == np.array(bb)

>> array([False,  True,  True])

See my example code on Stack Overflow:

http://stackoverflow.com/a/36983379/3279996
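
If NumPy is not available, the same position-by-position comparison can be sketched in plain Python with `zip`. This is only an alternative illustration, not the method from the linked answer; the variable names `matches` and `all_equal` are chosen for this example.

```python
aa = ["a", "b", "a"]
bb = ["b", "b", "a"]

# zip pairs up the elements by position. It stops at the shorter list,
# so check the lengths first if the two lists might differ in size.
matches = [left == right for left, right in zip(aa, bb)]
print(matches)  # [False, True, True]

# Comparing the lists directly gives a single boolean for the whole list.
all_equal = aa == bb
```

The NumPy version is more convenient when the result feeds into further array operations, while the list comprehension avoids the extra dependency.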

May
02

break the long line in python

There are several ways to break a long string across lines in Python. I think the triple-quoted (""") method is the easiest, because you can also use it for a JSON object.

import json

myJson = """ {
"result":
[{"short_description": "I am getting bluescreen error",
"sys_id": "39b5f8c2376ede007520021a54990e5c",
"opened_at": "2016-04-04 05:19:53",
"number": "INC0258523"
},
{
"short_description": "laptop crashed with a blue screen",
"sys_id": "da0095380f43d200a4f941fce1050eeb",
"opened_at": "2016-04-25 06:33:52",
"number": "INC0259067"
},
{
"short_description": "Laptop not booting",
"sys_id": "ecf9c9b00f43d200a4f941fce1050e17",
"opened_at": "2016-04-25 06:07:16",
"number": "INC0259061"
}]
}
"""
data = json.loads(myJson)

It works!
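
For comparison, another of the several methods is implicit concatenation: adjacent string literals inside parentheses are joined into one string. The sketch below is a minimal illustration; the names `message`, `record_json`, and `record` are made up for this example, and the JSON values are taken from the last record above.

```python
import json

# Adjacent string literals inside parentheses are joined into one string,
# with no newline characters added between the pieces.
message = (
    "There are several methods to break "
    "a long string line in Python."
)

# The same trick works for a short JSON document. Single quotes outside
# let the JSON keep its required double quotes inside.
record_json = (
    '{"short_description": "Laptop not booting", '
    '"number": "INC0259061"}'
)
record = json.loads(record_json)
```

Unlike the triple-quoted form, this keeps the resulting string on one logical line, which matters when newlines in the value would be unwanted.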

Mar
01

hide messages in RMD

When you knit an Rmd file, package messages and warnings clutter the output. There are two ways to suppress them:

1. Global method
{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

2. Local method
suppressMessages(library(dplyr))

Feb
28

An Overview of Outlier Detection Techniques and Applications

Yesterday I gave a presentation at the Machine Learning Rhein-Neckar Meetup. It was great to meet people who share the same love of data mining. The following is my presentation: the short version is the one I used for the talk, and the long version with annotations is for myself.

Feb
16

change R version

Source:

https://support.rstudio.com/hc/en-us/articles/200486138-Using-Different-Versions-of-R

Default R location:

$ which R
/usr/bin/R

It can be overridden by forcing RStudio to use a specific version:

export RSTUDIO_WHICH_R=/usr/local/bin/R
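
The same setting can be sketched from a Python launcher script. This is only an illustration under stated assumptions: the helper `rstudio_env` is hypothetical, and the commented-out launch line assumes an `rstudio` executable on the PATH.

```python
import os
import shutil


def rstudio_env(r_path, base_env=None):
    """Hypothetical helper: copy an environment and point RStudio at r_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["RSTUDIO_WHICH_R"] = r_path
    return env


# Equivalent of `which R`: the default R on the PATH, or None if not found.
default_r = shutil.which("R")

# Build an environment that selects the R binary from the export line above.
env = rstudio_env("/usr/local/bin/R")

# Assumption: an `rstudio` launcher exists on the PATH.
# import subprocess
# subprocess.Popen(["rstudio"], env=env)
```

Copying the environment rather than modifying `os.environ` keeps the change local to the launched process.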

Jan
03

The easiest way to update nodejs and npm

Linux:

sudo npm cache clean -f
sudo npm install -g n
sudo n stable

Windows:

Method 1:
1. Download and install the latest version of node.js
2. Go to the directory C:\Program Files\nodejs and run:
npm install -g npm@latest

Method 2:
1. Install Chocolatey:
2. choco install nodejs (install node.js)
choco update nodejs (update node.js; newer Chocolatey versions use `choco upgrade` instead of `choco update`)
choco update npm

Method 3:

Download the latest MSI from here, and run it

 

Dec
04

Highlights of the new RStudio releases

The new release of RStudio brings many improvements. As I read through them, I was very excited about the new features. Here are the highlights for me.

  • Data can be filtered, searched, and sorted
  • Execute R code from the Source Viewer using Ctrl+Enter
  • Keyboard shortcut quick reference (Windows/Linux: Alt+Shift+K)
  • Alt+Enter to run code while retaining cursor position
  • Ctrl+Shift+E to select within matching parens/braces
  • Ctrl+Shift+M for magrittr pipe operator (%>%)
  • Ctrl+Alt+Shift+E to expand selection to matching paren/brace
  • dev.new() now creates a new desktop graphics device if the RStudio device is already active
  • Default to current working directory for new project from existing directory