Understanding How the Brain Processes Music Through the Bach Trio Sonatas

This event is open to the public.

Daniel Forger, Professor of Mathematics and Computational Medicine and Bioinformatics
James Kibbie, Professor of Music and Chair of the Organ Department, University Organist
Caleb Mayer, Graduate Student Research Assistant (Mathematics)
Sarah Simko, Graduate Student Research Assistant (Organ Performance)

With support from the Data Science for Music Challenge Initiative through MIDAS, the team is taking a big data approach to understanding the patterns and principles of music. The project is developing and analyzing a library of digitized performances of the Trio Sonatas for organ by Johann Sebastian Bach, applying novel algorithms to study the music structure from a data science perspective. Organ students from the School of Music, Theatre & Dance will demonstrate how the Frieze Memorial Organ in Hill Auditorium is used to create big data files of live performances. The team will discuss how its analysis compares different performances to determine features that make performances artistic, as well as the common mistakes performers make. The digitized performances will be shared with researchers and will enable research and pedagogy in many disciplines, including data science, music performance, mathematics and music psychology.

ARC-TS joins Cloud Native Computing Foundation

By | General Interest, Happenings, News

Advanced Research Computing – Technology Services (ARC-TS) at the University of Michigan has become the first U.S. academic institution to join the Cloud Native Computing Foundation (CNCF), a foundation that advances the development and use of cloud native applications and services. Founded in 2015, CNCF is part of the Linux Foundation.

CNCF announced ARC-TS’s membership at the KubeCon and CloudNativeCon event in Copenhagen. A video of the opening remarks by CNCF Executive Director Dan Kohn can be viewed on the event website.

“Our membership in the CNCF signals our commitment to bringing cloud computing and containers technology to researchers across campus,” said Brock Palen, Director of ARC-TS. “Kubernetes and other CNCF platforms are becoming crucial tools for advanced machine learning, pipelining, and other research methods. We also look forward to bringing an academic perspective to the foundation.”

ARC-TS’s membership and participation in the group signal its adoption of, and commitment to, cloud-native technologies and practices. Users of containers and other CNCF services will have access to experts in the field.

Membership gives the U-M research community input into the continuing development of cloud-native applications, both within CNCF-managed projects and in ancillary ones. U-M is the second academic institution to join the foundation, and the only one in the U.S.

U-M launches Data Science Master’s Program

By | Educational, General Interest, Happenings, News

The University of Michigan’s new, interdisciplinary Data Science Master’s Program is taking applications for its first group of students. The program is aimed at teaching participants how to extract useful knowledge from massive datasets using computational and statistical techniques.

The program is a collaboration between the College of Engineering (EECS), the College of Literature, Science, and the Arts (Statistics), the School of Public Health (Biostatistics), the School of Information, and the Michigan Institute for Data Science.

“We are very excited to be offering this unique collaborative program, which brings together expertise from four key disciplines at the University in a curriculum that is at the forefront of data science,” said HV Jagadish, Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science, who chairs the program committee.

“MIDAS was a catalyst in bringing faculty from multiple disciplines together to work towards the development of this new degree program,” he added.

MIDAS will provide students in this program with interdisciplinary collaborations, intellectual stimulation, exposure to a broad range of practice, networking opportunities, and space on Central Campus to meet for formal and informal gatherings.

For more information, see the program website at https://lsa.umich.edu/stats/masters_students/mastersprograms/data-science-masters-program.html, and the program guide (PDF) at https://lsa.umich.edu/content/dam/stats-assets/StatsPDF/MSDS-Program-Guide.pdf.

Applications are due March 15.

Hadoop and Spark Workshop

Overview

Learn how to process large amounts (up to terabytes) of data using SQL and/or simple programming models available in Python, R, Scala, and Java. Computers will be provided to follow along with hands-on examples; users can also bring laptops.
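
To give a feel for the "simple programming models" mentioned above, here is a minimal sketch of the map, shuffle, and reduce steps behind a word count. It is written in plain Python rather than against the Hadoop or Spark APIs, and the sample lines and function names are hypothetical, chosen only for illustration. On a cluster, a framework would run these same steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input; a real job would read records from a distributed filesystem.
lines = ["big data on Flux", "big clusters process big data"]

pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```

Because each line is mapped independently and each key is reduced independently, the work can be split across machines without changing the result.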

Prerequisites

Intro to the Linux Command Line or equivalent. This course assumes familiarity with the Linux command line.

A user account on Flux. If you do not have a Flux user account, apply on the account application page at https://arc-ts.umich.edu/fluxform/

Duo authentication.

Duo two-factor authentication is required to log in to the cluster. When logging in, you will need to type your UMICH password as well as authenticate through Duo in order to access Flux.

If you need to enroll in Duo, follow the instructions at Getting Started: How to Enroll in Duo.

Instructor

Brock Palen
Director
ARC-TS

Brock has over 10 years of high performance computing and data intensive computing experience in an academic environment. He currently works with the team at ARC-TS to provide HPC, data science, storage, and other research computing services to the University. Brock is also the Campus Champion for the NSF XSEDE project, representing the University to this and other national computing infrastructures and organizations.

Materials

Course Preparation

In order to participate successfully in the class exercises, you must have a Flux user account. The user account allows you to log in to the cluster, create, compile, and test applications, and transfer data into Hadoop’s filesystem for processing.

Flux user account

A single Flux user account can be used to prepare and submit jobs using various allocations. If you already possess a user account, you can use it for this course and skip to “Flux allocation” below. If not, please visit https://arc-ts.umich.edu/fluxform to obtain one. A user account is free to members of the University community. Please note that obtaining an account requires human processing, so be sure to do this at least two business days before class begins.

More help

Please email hpc-support@umich.edu with questions or comments, or to seek further assistance.

Data Science Certificate Info Session

An information session on the Data Science Certificate program will be held on 2/16 in room 1180 at the Duderstadt Building from 5:30 p.m. to 6:30 p.m.

Come learn about the Graduate Certificate in Data Science:

The certificate is focused on developing core proficiencies in data analytics:
1) Modeling — Understanding of core data science principles, assumptions and applications;
2) Technology — Knowledge of basic protocols for data management, processing, computation, information extraction, and visualization;
3) Practice — Hands-on experience with real data, modeling tools, and technology resources.

Data Science with Social Science data: an introduction to Pandas and StatsModels in Python

This workshop introduces participants to Python’s NumPy, Pandas DataFrames, Matplotlib and StatsModels using an advertising dataset. Participants will use these tools to model (OLS) associations between advertising expenditures and product sales in example data. We will start with an introductory explanation of Anaconda and the Jupyter notebook environment (although not required for the participant, the instructor will be using these tools). We will proceed with topics including: reading data files; creation, indexing and slicing of Pandas DataFrames; creation and handling of Matplotlib objects; and creation and interpretation of models using Python’s StatsModels. Although not required, we recommend that participants have a basic knowledge of Python.
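
As a rough illustration of the workflow the workshop covers, the sketch below builds a small DataFrame, slices it, and fits an OLS model with StatsModels. The data are synthetic, and the column names (`tv`, `radio`, `sales`) and coefficients are assumptions made for this example, not the workshop's actual advertising dataset. Plotting with Matplotlib is omitted to keep it short.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for an advertising dataset (assumed columns, not the real data).
rng = np.random.default_rng(0)
n = 200
ads = pd.DataFrame({
    "tv": rng.uniform(0, 300, n),     # hypothetical TV ad spend
    "radio": rng.uniform(0, 50, n),   # hypothetical radio ad spend
})
# Sales follow an assumed linear relationship plus random noise.
ads["sales"] = 3.0 + 0.05 * ads["tv"] + 0.2 * ads["radio"] + rng.normal(0, 1.0, n)

# Indexing and slicing: first five rows by position, then a boolean filter
# that keeps only two columns.
first_rows = ads.iloc[:5]
high_tv = ads.loc[ads["tv"] > 200, ["tv", "sales"]]

# Fit an ordinary least squares model of sales on both spending variables.
model = smf.ols("sales ~ tv + radio", data=ads).fit()

print(first_rows)
print(model.params)    # estimated intercept and slopes
print(model.rsquared)  # share of variance in sales explained by the model
```

Calling `model.summary()` prints the full regression table, including standard errors and confidence intervals, which is the usual starting point for interpreting the fit.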

Video available from MIDAS Research Forum

By | General Interest, Happenings, News, Research

Video from the MIDAS Research Forum, held Dec. 1 in the Michigan League, is now available at http://myumi.ch/6vA3V

The forum featured U-M students and faculty showcasing their data science research; a workshop on how to work with industry; presentations from student groups; and a summary of the data science consulting and infrastructure services available to the U-M research community.

NOTE: The keynote presentation from Christopher Rozell of the Georgia Institute of Technology will be available in the near future.

U-M partners with Cavium on Big Data computing platform

By | Feature, General Interest, Happenings, HPC, News

A new partnership between the University of Michigan and Cavium Inc., a San Jose-based provider of semiconductor products, will create a powerful new Big Data computing cluster available to all U-M researchers.

The $3.5 million ThunderX computing cluster will enable U-M researchers to, for example, process massive amounts of data generated by remote sensors in distributed manufacturing environments, or by test fleets of automated and connected vehicles.

The cluster will run the Hortonworks Data Platform, which provides Spark, Hadoop MapReduce and other tools for large-scale data processing.

“U-M scientists are conducting groundbreaking research in Big Data already, in areas like connected and automated transportation, learning analytics, precision medicine and social science. This partnership with Cavium will accelerate the pace of data-driven research and open up new avenues of inquiry,” said Eric Michielssen, U-M associate vice president for advanced research computing and the Louise Ganiard Johnson Professor of Engineering in the Department of Electrical Engineering and Computer Science.

“I know from experience that U-M researchers are capable of amazing discoveries. Cavium is honored to help break new ground in Big Data research at one of the top universities in the world,” said Cavium founder and CEO Syed Ali, who received a master of science in electrical engineering from U-M in 1981.

Cavium Inc. is a leading provider of semiconductor products that enable secure and intelligent processing for enterprise, data center, wired and wireless networking. The new U-M system will use dual-socket servers powered by Cavium’s ThunderX ARMv8-A workload-optimized processors.

The ThunderX product family is Cavium’s 64-bit ARMv8-A server processor for next generation Data Center and Cloud applications, and features high performance custom cores, single and dual socket configurations, high memory bandwidth and large memory capacity.

Alec Gallimore, the Robert J. Vlasic Dean of Engineering at U-M, said the Cavium partnership represents a milestone in the development of the College of Engineering and the university.

“It is clear that the ability to rapidly gain insights into vast amounts of data is key to the next wave of engineering and science breakthroughs. Without a doubt, the Cavium platform will allow our faculty and researchers to harness the power of Big Data, both in the classroom and in their research,” said Gallimore, who is also the Richard F. and Eleanor A. Towner Professor, an Arthur F. Thurnau Professor, and a professor both of aerospace engineering and of applied physics.

Along with applications in fields like manufacturing and transportation, the platform will enable researchers in the social, health and information sciences to more easily mine large, structured and unstructured datasets. This will eventually allow researchers, for example, to correlate health outcomes and disease outbreaks with information derived from socioeconomic, geospatial and environmental data streams.

U-M and Cavium chose to run the cluster on Hortonworks Data Platform, which is based on open source Apache Hadoop. The ThunderX cluster will deliver high-performance computing services for Hadoop analytics and, ultimately, a total of three petabytes of storage space.

“Hortonworks is excited to be a part of forward-leading research at the University of Michigan exploring low-powered, high-performance computing,” said Nadeem Asghar, vice president and global head of technical alliances at Hortonworks. “We see this as a great opportunity to further expand the platform and segment enablement for Hortonworks and the ARM community.”

Info session: Consulting and computing resources for data science — Nov. 8

By | Data, Educational, Events, General Interest, Happenings, HPC

Advanced Research Computing at U-M (ARC) will host an information session for graduate students in all disciplines who are interested in new computing and data science resources and services available to U-M researchers.

Brief presentations from members of ARC Technology Services (ARC-TS) on computing infrastructure, and from Consulting for Statistics, Computing, and Analytics Research (CSCAR) on statistics, data science, and computing training and consulting will be followed by a Q&A session, and opportunities to interact individually with ARC and CSCAR staff.

ARC and CSCAR are interested in connecting with graduate students whose research would benefit from customized or innovative computational or analytic approaches, and can provide guidance for students aiming to do this. ARC and CSCAR are also interested in developing training and documentation materials for a diverse range of application areas, and would welcome input from student researchers on opportunities to tailor our training offerings to new areas.

Speakers:

  • Kerby Shedden, Director, CSCAR
  • Brock Palen, Director, ARC-TS

Date/Time/Location:

Wednesday, Nov. 8, 2017, 2 – 4 p.m., West Conference Room, 4th Floor, Rackham Building (915 E. Washington St.)
