Monday, November 07, 2016

Postdoc position in Applied Statistics and Bayesian Modeling available immediately

A full-time postdoctoral position is available beginning immediately in the research group of Professor Tian Zheng working on reproducible modeling of social and behavioral data using Bayesian computing, in close cooperation with our collaborators in the social sciences.

Requirements: The work is highly interdisciplinary, and applicants must have strong statistical and computational skills. Preferred educational background is a PhD in statistics, computer science, computational social science or a related field. Expertise in statistical computing is required. Experience with R/Stan, Python, and/or data visualization is preferred but not necessary.

Environment: The research group is at Columbia University, based in the Statistics department, in the great city of New York. There will also be the opportunity to interact with our collaborators on other research projects involving psychology, genetics, and neural science.

Appointment: The initial appointment will be for one year, and is renewable. Salaries will be set based on experience and skills.

Applicants should send email to tzheng@stat.columbia.edu providing:
• a brief description of past research experience
• a brief description of future research interests and goals
• a resume of educational and research experience, including publications
• three letters of reference


Wednesday, September 28, 2016

ASA SLDS JSM 2017 student paper competition

Call for papers
Student Paper Competition - JSM 2017
(July 29th-Aug 3rd, 2017, Baltimore, MD)
ASA Section on Statistical Learning and Data Science
Sponsored by SLDS

Key dates:
• Abstracts due December 15th, 2016
• Full papers due January 4th, 2017

The Section on Statistical Learning and Data Science (SLDS) of the American Statistical Association (ASA) is sponsoring a student paper competition for the 2017 Joint Statistical Meetings in Baltimore, MD, on July 29th-August 3rd, 2017.

The paper may present original methodological research or a real-world application (from fields including but not limited to marketing, pharmaceuticals, genomics, bioinformatics, imaging, defense, business, and public health) that uses principles and methods in statistical learning and data science.

Papers that have been accepted for publication are NOT eligible for the competition. Selected winners will present their papers in a designated topic-contributed session at the 2017 JSM in Baltimore, MD, organized by the award committee. In this session, they will be presented with a monetary prize and an award certificate. Winning papers will be recommended for submission to Statistical Analysis and Data Mining: The ASA Data Science Journal, which is the flagship journal of the SLDS Section.

Graduate or undergraduate students who are enrolled in Fall 2016 or Winter/Spring 2017 are eligible to participate. The applicant MUST be the first author of the paper.

Abstracts (up to 1000 characters) are due 12:00 PM (noon) EST on December 15th, 2016 and shall be submitted via this Abstract submission form (https://goo.gl/forms/jkPrjVuRbeLUufGm2). ONLY students who submit their abstracts on time are eligible for submitting full papers after 12/15/2016.

Full papers and other application materials must be submitted electronically (in PDF, see instruction below) to Professor Tian Zheng (tian.zheng@columbia.edu) by 12:00 PM (noon) EST on Wednesday, January 4th, 2017. ONLY students who submit their abstracts by 12/15/2016 are eligible for submitting full papers.

All full paper email entries must include the following:
  1. An email message containing:
    • List of authors and contact information;
    • Abstract with no more than 1000 characters.
  2. Unblinded manuscript - double-spaced with no more than 25 pages including figures, tables, references and appendix. Please use 11pt fonts (preferably Arial or Helvetica) and 1-inch margins all around.
  3. Blinded versions of the abstract and manuscript (with no author names or references that could easily lead to author identification).
  4. A reference letter from a faculty member familiar with the student's work which MUST include a verification of the applicant's student status and, in the case of joint authorship, should indicate the fraction of the applicant's contribution to the manuscript.
All materials must be in English.

Entries will be reviewed by the Student Paper Competition Award committee. The selection criteria used by the committee will include statistical novelty, innovation and significance of the contribution to the field of application as well as the professional quality of the manuscript.

This year’s student competition is sponsored by ASA SLDS and is chaired by Professor Tian Zheng (Columbia University). Award announcements will be made in mid-January 2017. For inquiries, please contact Professor Tian Zheng (tian.zheng@columbia.edu).

Saturday, September 10, 2016

Resources for data analytics at Columbia

I got an email asking for resources (other than courses) for data analytics at Columbia. This is what I wrote in reply:

Monday, September 05, 2016

ADS alum Yuhan Sun's summer internship at UNICEF

A Spring 2016 alum of our Applied Data Science course, Yuhan Sun (MA in Statistics, Columbia University), spent the past summer as a data scientist at UNICEF. She extended a Shiny app that provides a web-based application for generating child mortality estimates. These estimates are computed from empirical data using the United Nations Inter-agency Group for Child Mortality Estimation (UN IGME) methodology. According to Ms. Lucia Hug from UNICEF,
"The UN IGME methodology applies a curve fitting method to derive trend estimates by using a Bayesian B-splines bias-reduction model to empirical [data] of under-five and infant mortality rates. The method also extrapolates the trend estimates to a defined time point."
In this project, Yuhan applied the Shiny app development skills she learned in our ADS course to meet the needs of this task. She said:
"Shiny provides a feasible approach for non-cs people to build a web application. It is an effective and time efficient way to build an application which uses the existing R codes for the Bayesian B-spline bias-reduction model. It also enables users to use the UN IGME methodology without being familiar with running R codes. It allows to apply the model to new empirical data and to review the new estimates graphically. It also offers the possibility to visualize the results and to adjust parameters according to the users’ needs."
Ms. Hug said: "Mortality rates among young children are a key output indicator for child health and well-being, and, more broadly, for social and economic development. It is a closely watched public health indicator because it reflects the access of children and communities to basic health interventions such as vaccination, to medical treatment of infectious diseases and to adequate nutrition."

Yuhan is also working on some cool visualizations as part of her internship.

Saturday, September 03, 2016

A happy birthday in numbers

On my birthday, I received a total of 65 "happy birthday!" messages (in either Chinese or English) via social messaging. This was the first year I got birthday wishes via LinkedIn. I thought it'd be fun to use Tableau to visualize some basic information about these messages, as a snapshot of a subset of my social network.
  • More than three quarters of the people in my birthday-wishing social network are Chinese.
  • The biggest three groups in my current online social networks of nice people are via family, via work, and via kids, which are the three main (if not ONLY) occasions of my social interactions. 
  • My Chinese friends are active on WeChat and my non-Chinese friends are mostly on Facebook.
  • Facebook and LinkedIn both post reminders about upcoming birthdays. From the trends, it seems that LinkedIn users are only active in the morning or in the evening, while Facebook users (at least those among my FB friends) are more active during the day and around meal times. 


Thursday, September 01, 2016

Applied Data Science

Another semester of data science fun is just around the corner. 
Follow us at TZStatsADS.


Wednesday, August 17, 2016

Postdoc position available immediately

Postdoctoral position in spatiotemporal data analysis

A full-time postdoctoral position is available beginning immediately in the research group of Professor Tian Zheng working on analysis of large spatiotemporal data sets, in close cooperation with our collaborators in neural imaging. 

Requirements: The work is highly interdisciplinary, and applicants must have strong statistical and computational skills. Preferred educational background is a PhD in statistics, computer science, computational neural science or a related field. Expertise in high performance computing is required. Experience with GPU computing is preferred but not necessary. 

Environment: The research group is at Columbia University, based in the Statistics department, in the great city of New York. There will also be the opportunity to work with our collaborators on other research projects involving statistical computing, genetics, and computational social science.

Appointment: The initial appointment will be for one year, and is renewable. Salaries will be set based on experience and skills.

Applicants should send email to tzheng@stat.columbia.edu providing:
• a brief description of past research experience
• a brief description of future research interests and goals
• a resume of educational and research experience, including publications
• three letters of reference

Wednesday, May 04, 2016

Call for papers--Statistical Learning for Data Science (SLDS)

Call For Papers 

(New deadline: June 12, 2016)

Special Session on

"Statistical Learning for Data Science (SLDS)"

In DSAA 2016: The 3rd IEEE International Conference on Data Science and Advanced Analytics

Montreal, Canada October 17-19, 2016

Organized by

  • Tian Zheng (Department of Statistics, Columbia University)
  • Wei Pan (Department of Biostatistics, University of Minnesota)
  • Hernando Ombao (Department of Statistics, University of California at Irvine)

Program Committee members

  • Genevera Allen (Department of Statistics, Rice University) 
  • Ke Deng (Center for Statistical Science, Tsinghua University)
  • Charles Doss (School of Statistics, University of Minnesota)
  • Bailey Fosdick (Department of Statistics, Colorado State University)
  • Mladen Kolar (University of Chicago, Booth School of Business)
  • Xi Luo (Department of Biostatistics and Center for Statistical Sciences, Brown University)
  • Ali Shojaie (Department of Biostatistics, University of Washington)
  • Gongjun Xu (School of Statistics, University of Minnesota)
  • Alexander Volfovsky (Department of Statistical Science, Duke University)
  • Sijian Wang (Department of Statistics, University of Wisconsin at Madison)
Statistics plays a central role in the data science approach. This special session aims to engage statisticians who study methods and theory that are fundamental to data science in discussion. Paper submissions on recent advances in statistical learning and modeling for complex data are encouraged.
Topics of interest include, but are not limited to:
  • Advances in theory or models associated with the analysis of massive, complex datasets;
  • Statistical modeling and data mining for data-driven solutions of real-world problems;
  • Innovative data mining algorithms or novel statistical approaches;
  • Comparison of techniques to solve a problem, along with an objective evaluation of the analyses and the solutions.
Conference content will be submitted for inclusion into IEEE Digital Library. The conference proceedings will be submitted for EI indexing through INSPEC by IEEE. Top quality papers accepted and presented at the conference will be selected for extension and publication in the special issues of some international journals, including IEEE TKDE, ACM TKDD, ACM TIIS and WWWJ.

Journal publication

Extended versions of papers accepted to this special session will be considered for a special issue of Statistical Analysis and Data Mining, the ASA data science journal.

Key dates

  • Paper Submission deadline: Sunday 12 June, 2016, 11:59 PM PDT
  • Notification of acceptance: 15 July, 2016
  • Final Camera-ready papers due: 19 August, 2016

Submission Instruction

Monday, January 18, 2016

Project-based learning (PBL)

This semester I will be teaching Applied Data Science (W4249), a project-based learning (PBL) course. I came up with the idea for this course without being aware of this line of discussion in education innovation. As I have been preparing for this course, I started looking for research on instructional practices using projects. I have discovered some very nice discussion on PBL. PBL has been more widely discussed in K-12 education, especially in STEM curriculums. One of the main objectives of PBL is to guide students in the process of becoming self-directed lifelong learners. This is quite appealing, as no one can acquire, within their time in school, all the knowledge required to solve every problem.