The Department of Data Science, founded in 2021, is the newest addition to the Ying Wu College of Computing. It was founded by well-established, prominent researchers and educators with outstanding track records in Artificial Intelligence, Machine Learning, High Performance Data Analytics, Security/Privacy/Ethics in Data Science, Health Data Science, Green Data Science, and Data Visualization. The M.S. degree program in Data Science is jointly administered by the Department of Data Science and the Department of Mathematical Sciences. This degree program responds to a strong demand from employers for trained Data Scientists. Data is revolutionizing most industries, and M.S. graduates in Data Science command high starting salaries.

Data Science combines powerful methods from Computer Science, Statistics, Artificial Intelligence, and Machine Learning into a unique new blend of techniques for deriving valuable insights from Big Data. Data Science is an ideal choice for students who are interested in solving real-world problems by applying data processing methods to ever larger and more varied data sets, including image, video, natural language, and speech data that go substantially beyond traditional text and table data. The Department of Data Science closely collaborates with the Department of Mathematical Sciences and the Department of Computer Science. Students can also get involved in state-of-the-art research projects at the NJIT Institute for Data Science, where top-notch scientists work with users to develop data-driven technologies that innovate the way the world works and lives.

Master of Science in Data Science

The Master of Science (M.S.) in Data Science (DS) is intended for students who are interested in pursuing advanced studies in data science.

Admission Requirements

  • GPA
    • Undergraduate GPA of at least 3.0 out of 4.0 is required for students with a data science, applied statistics, or computer science background.
    • Undergraduate GPA of at least 3.0 out of 4.0 is required for students without a data science, applied statistics, or computer science background. Students wishing to pursue the computing track who have an insufficient computing background will be asked to enroll in a relevant Certificate Program and obtain a GPA of at least 3.0 before being admitted to the M.S. program. Students wishing to pursue the statistics track who have an insufficient mathematics/statistics background will be asked to successfully complete suitable bridge courses, as determined by the advisor’s review.
    • International students without a GPA must have graduated “first class,” corresponding to a B average.
  • International students: TOEFL required. The Institute requires a minimum score of 213 (paper-based) or 79 (online).
  • International students: GRE required.
  • Students with a US or Canadian degree in data science, computer science, mathematical sciences, or engineering: GRE recommended but not required.
  • Students with a US or Canadian degree not in data science, computer science, mathematical sciences, or engineering: GRE required.

Students are expected to have good programming skills and a grasp of the fundamentals of computer science, data science, and the mathematical sciences. Students should have acquired this knowledge in a Bachelor of Science program in Data Science, Applied Statistics, or Computer Science, or in an equivalent degree program. Detailed topics are listed below.

Applicants to the computing concentration who lack the necessary computing background should first enroll in one of the three associated Data Science Certificates (Data Mining, Data Visualization, Big Data) and, upon successful completion of the Certificate, apply for transfer into the computing concentration of the M.S. in DS program. Applicants to the statistics concentration with insufficient background in mathematics/statistics will be asked to complete suitable bridge courses, as determined by the advisor’s review.
Students must maintain a cumulative graduate GPA of 3.0 or better throughout the course of studies and for graduation.

Application Processing

The Departments of Data Science and Mathematical Sciences review only completed applications submitted to the Office of Graduate Admissions. Applicants are advised to request status information on their application directly from the Graduate Admissions Office, not the Departments of Data Science or Mathematical Sciences. Graduate Admissions can be reached at admissions@njit.edu or www.njit.edu/gadmission or by mail at

NJIT, Graduate Admissions Office, University Heights, Newark NJ 07102.

Detailed Topics:

Students entering the M.S. in DS program are expected to have mastered the following topics: basic programming constructs, writing and debugging programs, iteration, recursion; basic data structures (lists, arrays, hash tables), search and sort, algorithm analysis; basic probability distributions and statistical analysis; linear algebra; calculus (derivatives, integrals, applications, functions of multiple variables).

B

Bader, David, Distinguished Professor

D

Dasgupta, Aritra, Assistant Professor

G

Geller, James, Professor

P

Phan, Hai, Assistant Professor

R

Roshan, Usman, Associate Professor

W

Wu, Chase, Professor

DS 632. Computing with Advanced Data Representations. 3 credits, 3 contact hours.

Prerequisites: CS 631 or knowledge of SQL, and good knowledge of a high-level programming language. Covers the rapidly changing concepts and principles of modern database systems and database programming. Topics include relational databases, object-oriented databases and object-oriented data modeling, XML data, JSON data, Graph Databases, NO-SQL databases, programming with stored procedures, database security and access control, indexing in SQL and NO-SQL databases, and current trends in databases (e.g., serverless cloud databases). Example systems used include Oracle with PL/SQL, MongoDB, and Neo4j.

DS 636. Data Analytics with R Program. 3 credits, 3 contact hours.

Prerequisites: Entry-level courses in programming, probability, and statistics (e.g., MATH 333, CS 280, or equivalent courses with permission of the instructor). This course teaches data analytics with R programming. Students will gain basic analytic skills through this high-level language. The course covers the fundamentals of R programming and introduces popular R packages for data science as working examples. The course also includes case studies on data analytics projects. As a core course in data science, it provides skills that are highly desirable to both industry and academic employers.

DS 642. Applications of Parallel Computing. 3 credits, 3 contact hours.

Prerequisites: Proficiency in (non-parallel) programming in a high-level procedural language. This course will teach students how to design, analyze, and implement parallel programs for high-performance computational science and engineering applications. The course focuses on advanced computer architectures, parallel algorithms, parallel languages, and performance-oriented computing. Students will develop the knowledge and skills to efficiently solve challenging problems in science and engineering, where very fast computers are required either to perform complex simulations or to analyze enormous datasets.

DS 644. Introduction to Big Data. 3 credits, 3 contact hours.

Prerequisites: Permission of the instructor. This course provides in-depth coverage of various topics in big data, from data generation, storage, management, and transfer to analytics, with a focus on the state-of-the-art technologies, tools, architectures, and systems that constitute big-data computing solutions in high-performance networks. Real-life big-data applications and workflows in various domains (particularly in the sciences) are introduced as use cases to illustrate the development, deployment, and execution of a wide spectrum of emerging big-data solutions.

DS 650. Data Visualization and Interpretation. 3 credits, 3 contact hours.

The course focuses on teaching students data visualization theory, techniques, and tools. Students will learn why and how visualization can be applied in the human-centered data science pipeline and the different uses of visualization, such as exploratory data analysis and the communication of data-driven insights. They will gain practical experience in interpreting, critiquing, and comparing visualization techniques by using real-world data sets and case studies. Students will also develop interactive visualization interfaces as part of the class project. They will gain a broad understanding of how visualization can enhance trust in and interpretation of machine learning models. Students will read and learn about recent progress in the areas of information visualization, visual analytics, and human-data interaction.

DS 669. Reinforcement Learning. 3 credits, 3 contact hours.

Prerequisites: Linear algebra, basic probability, basic calculus, computer programming, or approval of instructor. Experience with machine learning, artificial intelligence, or deep learning (e.g., CS 675, CS 670, CS 677) is recommended. This course covers current topics, key concepts, and classic and modern algorithms in reinforcement learning and contains both theory and applications. The topics include, but are not limited to, Markov Decision Processes, exploration and exploitation, planning, value-based learning, and policy gradient. Students will present recent papers in reinforcement learning, work on written and programming assignments, and complete a reinforcement learning project. After completing this course, students will be able to start using reinforcement learning for real-world problems that can be specified as Markov Decision Processes.

DS 675. Machine Learning. 3 credits, 3 contact hours.

Prerequisites: Basic probability, linear algebra, computer programming, and graduate or undergraduate senior standing, OR approval of instructor. This course is an introduction to machine learning and contains both theory and applications. Students will get exposure to a broad range of machine learning methods and hands-on practice with real data. Topics include Bayesian classification, perceptron, neural networks, logistic regression, support vector machines, decision trees, random forests, boosting, dimensionality reduction, unsupervised learning, regression, and learning new feature spaces. There will be several programming assignments, one course project, one midterm exam, and one final exam.

DS 677. Deep Learning. 3 credits, 3 contact hours.

Prerequisites: DS 675 or approval of the instructor. Restrictions: None. This course covers current topics in data science. The topics include but are not limited to parallel programming on GPU and CPU multi-cores, deep learning, representation learning, optimization algorithms, and algorithms for big datasets. Students will present recent papers in data science, work on programming assignments, and do a machine learning/deep learning/data science project.

DS 680. Natural Language Processing. 3 credits, 3 contact hours.

Prerequisites: DS 675 or DS 677 and instructor's approval. This course teaches how to process one of the fundamental data sources, natural language, with the help of deep learning techniques. The goal of this course is to familiarize students with state-of-the-art language models, the wide variety of tasks performed with these models, and the fusion of these in deep learning architectures. This course will help students read advanced research papers on complex NLP concepts and theories, while the class project will help them apply NLP techniques to different domains.

DS 725. Independent Study I. 3 credits, 3 contact hours.

Approval of the academic advisor is required for registration. Students working on their PhD dissertation cannot register for both DS 725 and DS 726 with the same faculty member. This special course covers areas of study in which one or more students may be interested but for which there is not sufficiently broad interest to warrant a regular course offering. Students may not register for this course more than once.

DS 726. Independent Study II. 3 credits, 3 contact hours.

Approval of the academic advisor is required for registration. Students working on their PhD dissertation cannot register for both DS 725 and DS 726 with the same faculty member. This special course covers areas of study in which one or more students may be interested but for which there is not sufficiently broad interest to warrant a regular course offering. Students may not register for this course more than once.

DS 786. Selected Topics in Data Science. 3 credits, 3 contact hours.

Prerequisites: As determined by nature of topic area. Introduction to selected topics in data science.

DS 789. Trustworthy Artificial Intelligence. 3 credits, 3 contact hours.

Prerequisites: DS 675 or approval of instructor. As machine learning (ML) systems are increasingly deployed in real-world applications, it is critical to ensure that these systems behave responsibly and are trustworthy. Doing so will lead to wider adoption of ML in practice. This course will provide a deep understanding of state-of-the-art ML methods designed to make AI more trustworthy in the face of unforeseen faults, adversarial manipulation, and violations of ethical norms in privacy and fairness. Students will gain an understanding of and experience in using a set of methods and tools for deploying transparent, ethically sound, and robust machine learning solutions. The course is also an excellent opportunity to conduct research on security, privacy, and trustworthiness in ML and to find research topics for Ph.D. and M.S. theses.

DS 790A. Doctoral Dissertation & Research. 1 credit, 1 contact hour.

Corequisites: DS 791. Approval of the dissertation advisor is required for registration. Experimental and/or theoretical investigation of a relevant topic in data science. For PhD students who have successfully defended their dissertation proposal. The student must register in DS 790A every semester until successful dissertation defense. A written dissertation must be defended and approved by a committee of at least five members.

DS 791. Graduate Seminar. 0 credits, 1 contact hour.

Corequisite (for doctoral students only): DS 790. A seminar in which faculty, students, and invited speakers will present summaries of advanced topics in data science. In the course, students and faculty will discuss research procedures, dissertation organization, and content. Students engaged in research will present their own problems and research progress for discussion and criticism.

DS 792B. Pre-Doctoral Research. 3 credits, 3 contact hours.

Corequisites: DS 791. Approval of the dissertation advisor is required for registration. Preliminary experimental and/or theoretical investigation of a relevant topic in data science. For students who have passed the qualifying examination but have not defended the dissertation proposal. Students who have completed the required coursework but have not passed the qualifying examination also need permission of the academic advisor.