ARC-TS begins work on new “Great Lakes” cluster to replace Flux

By | Flux, Happenings, HPC, News

Advanced Research Computing – Technology Services (ARC-TS) is starting the process of creating a new, campus-wide computing cluster, “Great Lakes,” that will serve the broad needs of researchers across the University. Over time, Great Lakes will replace Flux, the shared research computing cluster that currently serves over 300 research projects and 2,500 active users.

“Researchers will see improved performance, flexibility and reliability associated with newly purchased hardware, as well as changes in policies that will result in greater efficiencies and ease of use,” said Brock Palen, director of ARC-TS.

The Great Lakes cluster will be available to all researchers on campus for simulation, modeling, machine learning, data science, genomics, and more. The platform will provide a balanced combination of computing power, I/O performance, storage capability, and accelerators.

ARC-TS is in the process of procuring the cluster. Only minimal interruption to ongoing research is expected. A “Beta” cluster will be available to help researchers learn the new system before Great Lakes is deployed in the first half of 2019.

The Flux cluster is approximately 8 years old, although many of the individual nodes are newer. One of the benefits of replacing the cluster is to create a more homogeneous platform.

Based on extensive input from faculty and other stakeholders across campus, the new Great Lakes cluster will be designed to deliver services and capabilities similar to those of Flux, including the ability to accommodate faculty purchases of hardware and access to GPUs and large-memory nodes, along with improved support for emerging uses such as machine learning and genomics. The cluster will consist of approximately 20,000 cores.

For more information, contact hpc-support@umich.edu, and see arc-ts.umich.edu/systems-services/greatlakes, where updates to the project will be posted.

Interdisciplinary Committee on Organizational Studies (ICOS) Big Data Summer Camp, May 14-18

By | Data, Educational, General Interest, Happenings, News
Social and organizational life are increasingly conducted online through electronic media, from emails to Twitter feeds to dating sites to GPS phone tracking. The traces these activities leave behind have acquired the (misleading) title of “big data.” Within a few years, a standard part of graduate training in the social sciences will include a hefty dose of “using big data,” and we will all be using terms like API and Python.

This year ICOS, MIDAS, and ARC are again offering a one-week “big data summer camp” for doctoral students interested in organizational research, with a combination of detailed examples from researchers; hands-on instruction in Python, SQL, and APIs; and group work to apply these ideas to organizational questions. Enrollment is free, but students must commit to attending all day for each day of camp, and be willing to work in interdisciplinary groups.

The camp runs all day, May 14-18.

U-M launches Data Science Master’s Program

By | Educational, General Interest, Happenings, News

The University of Michigan’s new, interdisciplinary Data Science Master’s Program is taking applications for its first group of students. The program is aimed at teaching participants how to extract useful knowledge from massive datasets using computational and statistical techniques.

The program is a collaboration between the College of Engineering (EECS), the College of Literature, Science, and the Arts (Statistics), the School of Public Health (Biostatistics), the School of Information, and the Michigan Institute for Data Science.

“We are very excited to be offering this unique collaborative program, which brings together expertise from four key disciplines at the University in a curriculum that is at the forefront of data science,” said HV Jagadish, Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science, who chairs the program committee.

“MIDAS was a catalyst in bringing faculty from multiple disciplines together to work towards the development of this new degree program,” he added.

MIDAS will provide students in this program with interdisciplinary collaborations, intellectual stimulation, exposure to a broad range of practice, networking opportunities, and space on Central Campus to meet for formal and informal gatherings.

For more information, see the program website at https://lsa.umich.edu/stats/masters_students/mastersprograms/data-science-masters-program.html, and the program guide (PDF) at https://lsa.umich.edu/content/dam/stats-assets/StatsPDF/MSDS-Program-Guide.pdf.

Applications are due March 15.

HPC training workshops begin Tuesday, Feb. 13

By | Educational, Events, General Interest, Happenings, HPC, News

A series of training workshops in high performance computing will be held Feb. 13 through March 6, 2018, presented by CSCAR in conjunction with Advanced Research Computing – Technology Services (ARC-TS).

Introduction to the Linux command line
This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also known as the “command line.”
Location: East Hall, Room B254, 530 Church St.
Dates: (Please sign up for only one)
• Tuesday, Feb. 13, 1 – 4 p.m. (full description | registration)
• Friday, Feb. 16, 9 a.m. – noon (full description | registration)

Introduction to the Flux cluster and batch computing
This workshop will provide a brief overview of the components of the Flux cluster, including the resource manager and scheduler, and will offer students hands-on experience.
Location: East Hall, Room B254, 530 Church St.
Dates: (Please sign up for only one)
• Monday, Feb. 19, 1 – 4 p.m. (full description | registration)
• Tuesday, March 6, 1 – 4 p.m. (full description | registration)

Advanced batch computing on the Flux cluster
This course will cover advanced areas of cluster computing on the Flux cluster, including common parallel programming models, dependent and array scheduling, and a brief introduction to scientific computing with Python, among other topics.
Location: East Hall, Room B250, 530 Church St.
Dates: (Please sign up for only one)
• Wednesday, Feb. 21, 1 – 5 p.m. (full description | registration)
• Friday, Feb. 23, 1 – 5 p.m. (full description | registration)

Hadoop and Spark workshop
Learn how to process large amounts (up to terabytes) of data using SQL and/or simple programming models available in Python, R, Scala, and Java.
Location: East Hall, Room B250, 530 Church St.
Dates: (Please sign up for only one)
• Thursday, Feb. 22, 1 – 5 p.m. (full description | registration)
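
As a rough illustration of the kind of simple programming model the Hadoop and Spark workshop description refers to, the sketch below imitates the map, shuffle, and reduce steps of a word count in plain Python. It is not workshop material and does not use Hadoop or Spark; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from functools import reduce


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: reduce(lambda a, b: a + b, values)
            for key, values in grouped.items()}


def word_count(lines):
    """Run the three steps in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

On a real cluster the framework would distribute the map and reduce steps across many machines; here everything runs in a single process so the structure of the model is easy to see.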

Video available from MIDAS Research Forum

By | General Interest, Happenings, News, Research

Video is now available from the MIDAS Research Forum held Dec. 1 in the Michigan League at http://myumi.ch/6vA3V

The forum featured U-M students and faculty showcasing their data science research; a workshop on how to work with industry; presentations from student groups; and a summary of the data science consulting and infrastructure services available to the U-M research community.

NOTE: The keynote presentation from Christopher Rozell of the Georgia Institute of Technology will be available in the near future.

U-M wraps up successful SC17 conference

By | General Interest, Happenings, HPC, News

Several University of Michigan researchers and professional IT staff attended the Supercomputing 17 (SC17) conference in Denver from Nov. 12-17, participating in a number of different ways, including demonstrations, presentations and tutorials.

U-M participation included:

  • Matt McLean, a Big Data systems administrator with ARC-TS, served as a panelist at a session titled “The ARM Software Ecosystem: Are We There Yet?” (Slides)
  • Jeff Sica, a research database administrator with ARC-TS, helped lead a Birds of a Feather session titled “Containers in HPC.” (Slides)
  • Quentin Stout (EECS) and Christiane Jablonowski (CLASP) taught the “Parallel Computing 101” tutorial.
  • Shawn McKee, U-M Department of Physics and OSiRIS Principal Investigator, demonstrated Object Storage and Caching for Science. (network topology diagrams)
  • Eric Boyd, Director of Research Networks, presented on Research Networking at the University of Michigan at the U-M exhibit booth.
  • Simon Adorf, Ph.D. Candidate, Chemical Engineering Department, U-M, presented on Simple Data and Workflow Management with Signac and GPU-Accelerated Predictive Material Design at the U-M exhibit booth.
  • ARC sponsored a career networking reception put on by Women in HPC. ARC Director Sharon Broude Geva spoke at the event.
  • Amy Liebowitz, a network architect at ITS, worked on SCinet, a high-capacity network created every year for the conference. Liebowitz was on the routing team, which is responsible for installing, configuring and supporting the high performance conference network. The routing team also coordinated external connectivity with commodity Internet and R&E WAN service providers.

Reading and discussion group: Data science in understanding and addressing climate change

By | Educational, Events, General Interest, Happenings

CSCAR announces a reading and discussion group, “Data science in understanding and addressing climate change,” that will meet on the third or fourth Friday of every month (depending on participants’ preferences) between 3 and 5 p.m. We will discuss reports and significant papers that illuminate fundamental issues in climate change science, policy, and management. The suggested format at this stage is to discuss one science paper or chapter and one policy (or management) paper or chapter. The focus will be on the spatial (and temporal) dimensions of the issue, and we will concentrate on methods and techniques, keeping the requirement for domain knowledge relatively low. We will emphasize the concepts behind the tools and techniques so that the discussion is accessible to a wider set of participants, but we will also get into the technical details.

This is an effort to bring people involved in climate change together from a data science perspective. The idea is to learn together in a fun environment and foster dialogue with a focus on how data science can provide the common ground for mutual learning and understanding.

We will meet in Rackham, but we are open to rotating the location. You will be able to participate remotely if you choose.

If you are interested, send an email to Manish Verma at manishve@umich.edu.

If you have any suggestions for discussion and reading, let us know. We will include chapters from the IPCC and US global change science programs in our discussion.

U-M partners with Cavium on Big Data computing platform

By | Feature, General Interest, Happenings, HPC, News

A new partnership between the University of Michigan and Cavium Inc., a San Jose-based provider of semiconductor products, will create a powerful new Big Data computing cluster available to all U-M researchers.

The $3.5 million ThunderX computing cluster will enable U-M researchers to, for example, process massive amounts of data generated by remote sensors in distributed manufacturing environments, or by test fleets of automated and connected vehicles.

The cluster will run the Hortonworks Data Platform providing Spark, Hadoop MapReduce and other tools for large-scale data processing.

“U-M scientists are conducting groundbreaking research in Big Data already, in areas like connected and automated transportation, learning analytics, precision medicine and social science. This partnership with Cavium will accelerate the pace of data-driven research and open up new avenues of inquiry,” said Eric Michielssen, U-M associate vice president for advanced research computing and the Louise Ganiard Johnson Professor of Engineering in the Department of Electrical Engineering and Computer Science.

“I know from experience that U-M researchers are capable of amazing discoveries. Cavium is honored to help break new ground in Big Data research at one of the top universities in the world,” said Cavium founder and CEO Syed Ali, who received a master of science in electrical engineering from U-M in 1981.

Cavium Inc. is a leading provider of semiconductor products that enable secure and intelligent processing for enterprise, data center, wired and wireless networking. The new U-M system will use dual-socket servers powered by Cavium’s ThunderX ARMv8-A workload-optimized processors.

The ThunderX product family is Cavium’s 64-bit ARMv8-A server processor for next generation Data Center and Cloud applications, and features high performance custom cores, single and dual socket configurations, high memory bandwidth and large memory capacity.

Alec Gallimore, the Robert J. Vlasic Dean of Engineering at U-M, said the Cavium partnership represents a milestone in the development of the College of Engineering and the university.

“It is clear that the ability to rapidly gain insights into vast amounts of data is key to the next wave of engineering and science breakthroughs. Without a doubt, the Cavium platform will allow our faculty and researchers to harness the power of Big Data, both in the classroom and in their research,” said Gallimore, who is also the Richard F. and Eleanor A. Towner Professor, an Arthur F. Thurnau Professor, and a professor both of aerospace engineering and of applied physics.

Along with applications in fields like manufacturing and transportation, the platform will enable researchers in the social, health and information sciences to more easily mine large, structured and unstructured datasets. This will eventually allow researchers, for example, to correlate health outcomes and disease outbreaks with information derived from socioeconomic, geospatial and environmental data streams.

U-M and Cavium chose to run the cluster on Hortonworks Data Platform, which is based on open source Apache Hadoop. The ThunderX cluster will deliver high-performance computing services for Hadoop analytics and, ultimately, a total of three petabytes of storage space.

“Hortonworks is excited to be a part of forward-leading research at the University of Michigan exploring low-powered, high-performance computing,” said Nadeem Asghar, vice president and global head of technical alliances at Hortonworks. “We see this as a great opportunity to further expand the platform and segment enablement for Hortonworks and the ARM community.”

Info session: Consulting and computing resources for data science — Nov. 8

By | Data, Educational, Events, General Interest, Happenings, HPC

Advanced Research Computing at U-M (ARC) will host an information session for graduate students in all disciplines who are interested in new computing and data science resources and services available to U-M researchers.

Brief presentations from members of ARC Technology Services (ARC-TS) on computing infrastructure, and from Consulting for Statistics, Computing, and Analytics Research (CSCAR) on statistics, data science, and computing training and consulting will be followed by a Q&A session, and opportunities to interact individually with ARC and CSCAR staff.

ARC and CSCAR are interested in connecting with graduate students whose research would benefit from customized or innovative computational or analytic approaches, and can provide guidance for students aiming to do this. ARC and CSCAR are also interested in developing training and documentation materials for a diverse range of application areas, and would welcome input from student researchers on opportunities to tailor our training offerings to new areas.

Speakers:

  • Kerby Shedden, Director, CSCAR
  • Brock Palen, Director, ARC-TS

Date/Time/Location:

Wednesday, Nov. 8, 2017, 2 – 4 p.m., West Conference Room, 4th Floor, Rackham Building (915 E. Washington St.)
