Video available from MIDAS Research Forum

By | General Interest, Happenings, News, Research

Video is now available from the MIDAS Research Forum held Dec. 1 in the Michigan League at http://myumi.ch/6vA3V

The forum featured U-M students and faculty showcasing their data science research; a workshop on how to work with industry; presentations from student groups; and a summary of the data science consulting and infrastructure services available to the U-M research community.

NOTE: The keynote presentation from Christopher Rozell of the Georgia Institute of Technology will be available in the near future.

Yottabyte Research Cloud able to accept HIPAA-aligned data

By | General Interest, HPC, News

Advanced Research Computing – Technology Services (ARC-TS) is pleased to announce that the Yottabyte Research Cloud (YBRC) computing platform is now HIPAA-compliant. This means that YBRC and its associated services can accept restricted data, enabling secure data analysis on Windows and Linux virtual desktops as well as secure hosting of databases and data ingestion.

The new capability ensures the security of restricted data through the creation of firewalled network enclaves, allowing HIPAA-aligned data to be analyzed safely and securely in YBRC’s flexible, robust and scalable environment. Within each network enclave, researchers have access to Windows and Linux virtual desktops that can contain any software required for their analysis pipeline.

This capability also extends to our database and ingestion services:

  • Structured databases: MySQL/MariaDB and PostgreSQL.
  • Unstructured databases: Cassandra, MongoDB, InfluxDB, Grafana, and Elasticsearch.
  • Data ingestion: Redis, Kafka, RabbitMQ.
  • Data processing: Apache Flink, Apache Storm, Node.js and Apache NiFi.
  • Other data services are available upon request.

YBRC is supported by U-M’s Data Science Initiative, launched in 2015. YBRC was created through a partnership between Yottabyte and ARC-TS announced last fall.

These tools are offered to all researchers at the University of Michigan free of charge, provided that certain usage limits are not exceeded. Large-scale users who outgrow the no-cost allotment may purchase additional YBRC resources. All interested parties should contact hpc-support@umich.edu.

ARC Director Sharon Broude Geva re-elected vice-chair of Coalition for Academic Scientific Computation

By | General Interest, News

Sharon Broude Geva, the Director of Advanced Research Computing at the University of Michigan, has been re-elected vice-chair of the Coalition for Academic Scientific Computation (CASC).

Founded in 1989, CASC advocates for the use of advanced computing technology to accelerate scientific discovery for national competitiveness, global security, and economic success. The organization’s members represent 84 institutions of higher education and national labs.

The vice-chair position is one of four elected CASC executive officers. The officers work closely as a team with the director of CASC. The vice-chair also leads CASC meeting program committees, is responsible for recruitment of new members, substitutes for the chair in his or her absence, and assists with moderating CASC meetings.

Geva served as CASC secretary in 2015 and 2016, and served one term as vice-chair in 2017. Her next term as vice-chair is effective for the 2018 calendar year.

The other executive officers for 2017 are Rajendra Bose, Chair, Columbia University; Neil Bright, Secretary, Georgia Institute of Technology; and Andrew Sherman, Treasurer, Yale University. Curt Hillegas of Princeton University is immediate past chair.

The 2018 CASC brochure is available online.

U-M wraps up successful SC17 conference

By | General Interest, Happenings, HPC, News

Several University of Michigan researchers and professional IT staff attended the Supercomputing 17 (SC17) conference in Denver from Nov. 12-17, participating in a number of different ways, including demonstrations, presentations and tutorials.

U-M participation included:

  • Matt McLean, a Big Data systems administrator with ARC-TS, served as a panelist at a session titled “The ARM Software Ecosystem: Are We There Yet?” (Slides)
  • Jeff Sica, a research database administrator with ARC-TS, helped lead a Birds of a Feather session titled “Containers in HPC.” (Slides)
  • Quentin Stout (EECS) and Christiane Jablonowski (CLASP) taught the “Parallel Computing 101” tutorial.
  • Shawn McKee, U-M Department of Physics, and OSiRIS Principal Investigator, demonstrated Object Storage and Caching for Science (network topology diagrams)
  • Eric Boyd, Director of Research Networks, presented on Research Networking at the University of Michigan at the U-M exhibit booth.
  • Simon Adorf, Ph.D. Candidate, Chemical Engineering Department, U-M, presented on Simple Data and Workflow Management with Signac and GPU-Accelerated Predictive Material Design at the U-M exhibit booth.
  • ARC sponsored a networking and career reception put on by Women in HPC. ARC Director Sharon Broude Geva spoke at the event.
  • Amy Liebowitz, a network architect at ITS, worked on SCinet, a high-capacity network created every year for the conference. Liebowitz was on the routing team, which is responsible for installing, configuring and supporting the high-performance conference network. The routing team also coordinated external connectivity with commodity Internet and R&E WAN service providers.

U-M partners with Cavium on Big Data computing platform

By | Feature, General Interest, Happenings, HPC, News

A new partnership between the University of Michigan and Cavium Inc., a San Jose-based provider of semiconductor products, will create a powerful new Big Data computing cluster available to all U-M researchers.

The $3.5 million ThunderX computing cluster will enable U-M researchers to, for example, process massive amounts of data generated by remote sensors in distributed manufacturing environments, or by test fleets of automated and connected vehicles.

The cluster will run the Hortonworks Data Platform providing Spark, Hadoop MapReduce and other tools for large-scale data processing.

“U-M scientists are conducting groundbreaking research in Big Data already, in areas like connected and automated transportation, learning analytics, precision medicine and social science. This partnership with Cavium will accelerate the pace of data-driven research and open up new avenues of inquiry,” said Eric Michielssen, U-M associate vice president for advanced research computing and the Louise Ganiard Johnson Professor of Engineering in the Department of Electrical Engineering and Computer Science.

“I know from experience that U-M researchers are capable of amazing discoveries. Cavium is honored to help break new ground in Big Data research at one of the top universities in the world,” said Cavium founder and CEO Syed Ali, who received a master of science in electrical engineering from U-M in 1981.

Cavium Inc. is a leading provider of semiconductor products that enable secure and intelligent processing for enterprise, data center, wired and wireless networking. The new U-M system will use dual socket servers powered by Cavium’s ThunderX ARMv8-A workload optimized processors.

The ThunderX product family is Cavium’s 64-bit ARMv8-A server processor for next generation Data Center and Cloud applications, and features high performance custom cores, single and dual socket configurations, high memory bandwidth and large memory capacity.

Alec Gallimore, the Robert J. Vlasic Dean of Engineering at U-M, said the Cavium partnership represents a milestone in the development of the College of Engineering and the university.

“It is clear that the ability to rapidly gain insights into vast amounts of data is key to the next wave of engineering and science breakthroughs. Without a doubt, the Cavium platform will allow our faculty and researchers to harness the power of Big Data, both in the classroom and in their research,” said Gallimore, who is also the Richard F. and Eleanor A. Towner Professor, an Arthur F. Thurnau Professor, and a professor both of aerospace engineering and of applied physics.

Along with applications in fields like manufacturing and transportation, the platform will enable researchers in the social, health and information sciences to more easily mine large, structured and unstructured datasets. This will eventually allow, for example, researchers to discover correlations between health outcomes and disease outbreaks with information derived from socioeconomic, geospatial and environmental data streams.

U-M and Cavium chose to run the cluster on Hortonworks Data Platform, which is based on open source Apache Hadoop. The ThunderX cluster will deliver high-performance computing services for Hadoop analytics and, ultimately, a total of three petabytes of storage space.

“Hortonworks is excited to be a part of forward-leading research at the University of Michigan exploring low-powered, high-performance computing,” said Nadeem Asghar, vice president and global head of technical alliances at Hortonworks. “We see this as a great opportunity to further expand the platform and segment enablement for Hortonworks and the ARM community.”

CSCAR provides walk-in support for new Flux users

By | Data, Educational, Flux, General Interest, HPC, News

CSCAR now provides walk-in support during business hours for students, faculty, and staff seeking assistance in getting started with the Flux computing environment. CSCAR consultants can walk a researcher through the steps of applying for a Flux account, installing and configuring a terminal client, connecting to Flux, using basic SSH and Unix command-line tools, and obtaining or accessing allocations.

In addition to walk-in support, CSCAR has several staff consultants with expertise in advanced and high performance computing who can work with clients on a variety of topics such as installing, optimizing, and profiling code.  

Email support is also available at hpc-support@umich.edu.

CSCAR is located in room 3550 of the Rackham Building (915 E. Washington St.). Walk-in hours are 9 a.m. to 5 p.m., Monday through Friday, except for noon to 1 p.m. on Tuesdays.

See the CSCAR web site (cscar.research.umich.edu) for more information.

University of Michigan researcher contributes to NASA findings on carbon in the atmosphere showcased in the journal Science

By | General Interest, Happenings, News

High-resolution satellite data from NASA’s Orbiting Carbon Observatory-2 are revealing the subtle ways that carbon links everything on Earth – the ocean, land, atmosphere, terrestrial ecosystems and human activities. Scientists using the first 2 1/2 years of OCO-2 data have published a special collection of five papers today in the journal Science that demonstrates the breadth of this research. In addition to showing how drought and heat in tropical forests affected global carbon dioxide levels during the 2015-16 El Niño, other results from these papers focus on ocean carbon release and absorption, urban emissions and a new way to study photosynthesis. A final paper by OCO-2 Deputy Project Scientist Annmarie Eldering of NASA’s Jet Propulsion Laboratory in Pasadena, California, and colleagues gives an overview of the state of OCO-2 science.

Manish Verma, a Geospatial/Data Science Consultant at the University of Michigan’s Consulting for Statistics, Computing and Analytics Research (CSCAR) unit, contributed as a coauthor to an article on a new way to measure photosynthesis over time and space.

Using data from the OCO-2, Verma’s analysis helped expand the utility of measurements of solar induced fluorescence (SIF), which indicates active photosynthesis in plants. Verma’s work showed that SIF data collected from the OCO-2 satellite provide reliable information on the variability of photosynthesis at a much smaller scale, down to individual ecosystems.

This can, in turn, “lead to more reliable estimates of carbon sources — that is, when, where, why and how carbon is exchanged between land and atmosphere — as well as a deeper understanding of carbon-climate feedbacks,” according to the Science article.

For more, see the NASA press release (https://www.nasa.gov/feature/jpl/new-insights-from-oco-2-showcased-in-science) and the Science article (http://science.sciencemag.org/content/358/6360/eaam5747.full)

Real estate dataset available to researchers

By | Data, Data sets, Educational, General Interest, Happenings, News

The University of Michigan Library system and the Data Acquisition for Data Sciences program (DADS) of the U-M Data Science Initiative (DSI) have recently joined forces to license a major data resource capturing parcel-level information about the property market in the United States.  

The data were licensed from the CoreLogic corporation, which has assembled deed, tax and foreclosure information on nearly all properties in the US. Coverage dates vary by county; some county records go back fifty years. Coverage is more comprehensive from the 1990s to the present.

These data will support a variety of research efforts into regional economies, economic disparities, trends in land-use, housing market dynamics, and urban ecology, among many other areas.

The data are available on the Turbo Research Storage system for users of the U-M High Performance Computing infrastructure, and via the University of Michigan Library.

To access the data, researchers must first sign an MOU; contact Senior Associate Librarian Catherine Morse (cmorse@umich.edu) for more information, or visit https://www.lib.umich.edu/database/corelogic-parcel-level-real-estate-data.

HPC training workshops begin Thursday, Sept. 21

By | Educational, Events, General Interest, HPC, News

A series of training workshops in high performance computing will be held Sept. 21 through Oct. 31, 2017, presented by CSCAR in conjunction with Advanced Research Computing – Technology Services (ARC-TS). All sessions are held in East Hall, 530 Church St.; the room for each session is listed with the workshop.

Introduction to the Linux command line
This course will familiarize the student with the basics of accessing and interacting with Linux computers using the GNU/Linux operating system’s Bash shell, also known as the “command line.”
Dates: (Please sign up for only one)
• Thursday, Sept. 21, 9 a.m. – noon (full description | registration)
• Thursday, Sept. 28, 9 a.m. – noon (full description | registration)
Location:
East Hall, Room B250, 530 Church St.

Introduction to the Flux cluster and batch computing
This workshop will provide a brief overview of the components of the Flux cluster, including the resource manager and scheduler, and will offer students hands-on experience.
Dates: (Please sign up for only one)
• Thursday, Sept. 28, 1 – 4 p.m. (full description | registration)
• Monday, Oct. 2, 9 a.m. – noon (full description | registration)
Location:
East Hall, Room B254, 530 Church St.

Advanced batch computing on the Flux cluster
This course will cover advanced areas of cluster computing on the Flux cluster, including common parallel programming models and dependent and array scheduling, among other topics.
Dates: (Please sign up for only one)
• Tuesday, Oct. 10, 1 – 5 p.m. (full description | registration) Location: East Hall, Room B254, 530 Church St.
• Thursday, Oct. 12, 9 a.m. – noon (full description | registration) Location: East Hall, Room B254, 530 Church St.
• Friday, Oct. 13, 9 a.m. – noon (full description | registration) Location: East Hall, Room B250, 530 Church St.

Hadoop Workshop
Learn how to process large amounts (up to terabytes) of data using SQL and/or simple programming models available in Python, Scala, and Java.
Date:
• Tuesday, Oct. 31, 1 – 5 p.m. (full description | registration)
Location:
East Hall, Room B254, 530 Church St.

SAVE THE DATE: MIDAS Annual Symposium, Oct. 11

By | Events, General Interest, News

Please join us for the 2017 Michigan Institute for Data Science Symposium.

The keynote speaker will be Cathy O’Neil, mathematician and best-selling author of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.”

Other speakers include:

  • Nadya Bliss, Director of the Global Security Initiative, Arizona State University
  • Francesca Dominici, Co-Director of the Data Science Initiative and Professor of Biostatistics, Harvard T.H. Chan School of Public Health
  • Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington
  • James Pennebaker, Professor of Psychology, University of Texas

More details, including how to register, will be available soon.