Update on the IBM Lawsuit

Unfortunately, after several months of litigation, IBM still has not provided any information on what specific code in the OpenLava project infringed IBM intellectual property. IBM alleges that Teraproc used IBM technologies in OpenLava 3.0 and each subsequent release. OpenLava 3.0 was developed in an open manner, using github.com to track all source code changes. OpenLava 3.0 consisted of three major features: fairshare scheduling, preemption, and job messaging (bpost/bread). Dev...
More

Managing Parallel Jobs in OpenLava

Blaunch – a new parallel job remote task launcher in OpenLava 4.0. HPC environments are often complex by nature, with many moving parts. This is especially true of parallel workloads. Making MPI jobs run reliably and predictably under the control of a workload manager can go a long way toward alleviating a range of potential problems and making the HPC environment more reliable and productive. In a perfect world, the process of launching and managing MPI tasks would be consistent across all workl...
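As a sketch of how a launcher like this is typically used (a hypothetical job script; the `blaunch` invocation and submission flags below are assumptions based on the description above, not quoted from the release):

```shell
#!/bin/sh
# Hypothetical OpenLava job script; submit with: bsub -n 8 < run_tasks.sh
# blaunch starts the task on each slot allocated to the job, so the remote
# processes stay under scheduler control for accounting and cleanup.
blaunch ./my_parallel_task
```

Because the scheduler itself starts the remote tasks, they can be tracked, accounted for, and cleaned up with the job rather than escaping as orphan processes.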
More

License Scheduling in OpenLava

In June 2016 at the annual DAC conference, Teraproc previewed new license scheduling capabilities in OpenLava, explaining how EDA licenses could be shared among different users, design teams and projects on the same cluster. With the release of OpenLava 4.0, resource-based preemption has arrived, making OpenLava a much more compelling choice for EDA firms concerned about license management. New functionality in OpenLava supports not only flexible license sharing on the same cluster, but...
More

What’s New in OpenLava 4.0

OpenLava 4.0 is a significant new release that builds on the scalability improvements in previous releases while adding many new features. OpenLava has been enhanced in the following areas: more flexible resource limits; enhanced parallel job management; improved software license management and preemption; NUMA features and enhanced processor affinity controls; cluster management enhancements; fairshare scheduling enhancements; and support for job groups. Below, we provide a high-level overview of...
More

Teraproc OpenLava at DAC 2016

Thanks to all of our wonderful friends and customers for taking the time to stop by and say hello at this year's DAC conference in Austin, Texas. It was a highly worthwhile event! Thanks also to the new clients who have adopted OpenLava and are realizing the benefits of flexible, open-source workload management for electronic design environments. In case you missed our announcements and technology demonstrations, or are looking for details about the latest OpenLava release, a few pointers b...
More

OpenLava 3.3 – New Features

Performance and Scalability Enhancements. As with previous releases, performance and scalability are a significant focus in OpenLava. OpenLava 3.3 provides significant enhancements enabling large clusters to remain responsive even while processing large numbers of jobs. Some specific enhancements in OpenLava 3.3 are described below. Administrators familiar with OpenLava will be aware that the lsb.events file is used to log any changes in status associated with hosts, jobs or queues. Similarly,...
More

Preview of License Optimization in OpenLava Enterprise Edition

Application licenses are precious resources in the design environment. It is always desirable to maintain a high level of license utilization while ensuring that high-priority jobs get licenses before low-priority jobs. A common approach to managing application licenses with a workload scheduler is to configure and track licenses as resources. Users then specify license requirements when submitting jobs. The scheduler will hold the job in a queue if the resource (license) is not available. This...
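The license-as-resource approach described above can be sketched at submission time. This is an illustrative submission fragment: the resource name `hspice` and the count are assumptions for the example, not taken from the article.

```shell
# Hypothetical submission requesting one license token, tracked by the
# scheduler as a numeric resource. If no token is free, the job pends
# in the queue until one is released by a finishing job.
bsub -R "rusage[hspice=1]" ./run_simulation.sh
```

The scheduler decrements the resource count while the job runs, so license demand is visible to scheduling decisions rather than discovered at run time.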
More

OpenLava 3.3 – Benchmarking one million jobs on a 100,000 core cluster

Note to reader: This article supersedes an earlier article on scalability testing featured in October 2015 on the Teraproc blog. With OpenLava 3.3, scalability has been enhanced significantly. As customers deploy OpenLava in ever larger environments, scalability, throughput and performance become increasingly important. To help meet customer requirements in these areas, OpenLava release 3.3 provides a number of important enhancements: parallelized job event handling to speed cluster...
More

Configuring OpenLava 3.2 for large clusters

Overview. OpenLava 3.2 is more scalable than previous OpenLava versions. When an OpenLava cluster has over 1,000 job slots, the default configuration settings are no longer suitable, and you should tune the OS and OpenLava configuration parameters on the master host. This document discusses how to configure OpenLava for large environments. These recommendations were implemented in a recent test involving a 500-node cluster with 50,000 cores running 1,000,000 jobs of various durations. OpenLava...
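As an illustration of the kind of master-host OS tuning the article discusses (the specific parameters and values below are assumptions typical for large clusters, not the article's recommended settings):

```shell
# Illustrative OS tuning on the OpenLava master host (run as root).
ulimit -n 65536                     # raise the open file descriptor limit
sysctl -w net.core.somaxconn=4096   # deepen the TCP listen backlog
```

On a busy master host, every connected execution host and client consumes a file descriptor, and bursts of job submissions queue on the listen socket, which is why these two limits are common first targets.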
More

New Features in OpenLava 3.2

A technical update on OpenLava 3.2 for cluster administrators. 1. Scalability improvements: 50k+ slots, 500k+ jobs per cluster. With OpenLava 3.2, Teraproc is increasing the supported job and slot limits for OpenLava Enterprise Edition. Any workload manager can claim to support large clusters, but what really matters is the ability to drive workload throughput and keep a large cluster fully utilized - a key point sometimes missed. With OpenLava 3.2, Teraproc is pleased to support...
More

Webinar: What’s new in OpenLava – March 3rd, 2016

OpenLava is an open-source workload manager. Over the past two years, the pace of OpenLava development has been nothing short of amazing. Whether you are using other workload management software or OpenLava, you'll want to learn about recent advances in OpenLava. Register now for our webinar on March 3rd, 2016, at 11:00 AM Eastern Time. In this free seminar, sponsored by Teraproc Inc. and OpenLava.org, you will learn about the advantages of open-source software, and understand why top-tier g...
More

Meet OpenLava.org’s founder: Dave Bigagli

As anyone who has managed a development effort knows, building quality software takes focus, dedication and collaboration. In open-source development this is especially true. It is often said that “it takes a village,” so we thought it would be nice to take a moment to acknowledge the contributions of the respected “mayor” of our own OpenLava.org village, David Bigagli. David was a senior architect at Platform Computing between 1996 and 2010 and has also worked with other leading software firm...
More

OpenLava gets a new WebGUI

I’ve often heard from experienced Linux sysadmins that GUIs get in the way of doing real work. As someone who doesn’t always know the right commands to use, though, I find a well-designed web interface can be pretty handy. Recently I had a chance to look at the new web interface now offered with OpenLava Enterprise Edition, Teraproc’s commercially supported version of OpenLava. Teraproc continues to make investments in OpenLava, and contribute improvements back to the open-source base (http://github...
More

OpenLava enjoys new momentum

Since OpenLava was first offered as an open source tool almost a decade ago, there have been a little over 6,000 downloads. While accurate metrics are hard to come by, especially in the early years, the OpenLava.org site saw a steady rate of one or two downloads per day, not including clones of the sources from GitHub. In the last year, we've devised better ways to gather metrics, and Teraproc has been tracking free downloads of the compiled RPMs for the OpenLava 2.2, 3.0 and now 3.1 relea...
More

Testing OpenLava at Scale

Note to reader: This result has been superseded by a more recent benchmark on OpenLava 3.3. For the latest results, please see this new 1,000,000-job benchmark instead. To validate the latest OpenLava 3.1 release at scale, Teraproc recently ran a significant benchmark on its HPC Cluster-as-a-Service. The benchmark was designed to stress the OpenLava scheduler with a large workload representative of what OpenLava users might run in production. The goals of the benchmark were to de...
More

What’s New in OpenLava 3.1?

A technical update on the OpenLava 3.1 release for cluster administrators. 1. Resource requirement enhancements to support multiple application licenses, e.g. availability of License A or License B. The resource requirement string in OpenLava 3.1 has been enhanced to support an OR operator. In design and simulation environments, users may have multiple versions of the same software tool where license usage is metered by FlexLM. While some simulations require a particular version of the licen...
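A minimal sketch of the OR operator in a resource requirement string. The resource names `licA` and `licB` are illustrative, and the exact select-string form is an assumption based on the description above rather than a quoted example:

```shell
# Hypothetical: dispatch the job when either License A or License B
# is available on the cluster.
bsub -R "select[licA>0 || licB>0]" ./simulate.sh
```

Without the OR operator, a user would have to pick one license version up front and wait for it, even while the other version sat idle.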
More

Ten ways to reduce the cost of your EDA infrastructure

As semiconductor design firms know well, infrastructure for EDA (Electronic Design Automation) is an expensive business. Traditional rules of thumb for IT costs don’t apply when the cost of tools and design talent dwarfs infrastructure costs. When it comes to EDA, productivity and efficiency are jobs one, two and three! If you’re managing a design environment, you’re probably running commercial tools from the likes of Cadence®, Synopsys® and Mentor Graphics. Thorough device simulation is critic...
More

Running MPI Jobs with OpenLava

Introduction. OpenLava is an open source workload manager that can schedule both serial and parallel jobs. MPI (Message Passing Interface) is a programming interface widely used in High-Performance Computing (HPC) applications to parallelize the execution of large-scale problems. There are multiple commonly used MPI implementations. This document describes how to run MPI applications with OpenLava. Most MPI implementations support integration with commonly used workload managers. For the mos...
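As a minimal sketch of the pattern (assuming an MPI build integrated with the workload manager, so that `mpirun` discovers the allocated hosts from the job environment; the slot count and binary name are illustrative):

```shell
# Submit a 16-way MPI job: the scheduler selects the hosts, and the
# integrated mpirun launches one rank per allocated slot.
bsub -n 16 mpirun ./mpi_app
```

With this integration, users do not maintain their own host files; the host list comes from the scheduler's allocation, so placement, accounting and cleanup remain consistent.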
More

GPU-Accelerated R in the Cloud with Teraproc Cluster-as-a-Service

Analysis of statistical algorithms can generate workloads that run for hours, if not days, tying up a single computer. Many statisticians and data scientists write complex simulations and statistical analysis using the R statistical computing environment. Often these programs have a very long run time. Given the amount of time R programmers can spend waiting for results, it makes sense to take advantage of parallelism in the computation and the available hardware. In a previous post on the Te...
More

Scaling R clusters? AWS Spot Pricing is your new best friend

An elastic infrastructure for distributed R Most of us recall the notion of elasticity from Economics 101. Markets are about supply and demand, and when there is an abundance of supply, prices usually go down. Elasticity is a measure of how responsive one economic variable is to another, and in an elastic market the response is proportionately greater than the change in input. It turns out that cloud pricing, on the margin at least, is pretty elastic. Like bananas in a supermarket, CPU cyc...
More

Why HPC Clusters are like Bananas

Realizing a more cost-efficient infrastructure Most of us recall the notion of elasticity from Economics 101. Markets are about supply and demand, and when there is an abundance of supply, prices usually go down. Elasticity is a measure of how responsive one economic variable is to another, and in an elastic market the response is proportionately greater than the change in input. What does this have to do with HPC or analytic clusters you ask? It turns out that cloud pricing, on the margin a...
More

Accelerating R with multi-node parallelism – Rmpi, BatchJobs and OpenLava

Gord Sissons, Feng Li. In a previous blog we showed how we could use the R BatchJobs package with OpenLava to accelerate a single-threaded k-means calculation by breaking the workload into chunks and running them as serial jobs. R users frequently need to find solutions to parallelize workloads, and while solutions like multicore and socket-level parallelism are good for some problems, when it comes to large problems there is nothing like a distributed cluster. The message passing inter...
More

Seeing the Forest and the Trees – a parallel machine learning example

Parallelizing Random Forests in R with BatchJobs and OpenLava. By Gord Sissons and Feng Li. In his series of blogs about machine learning, Trevor Stephens focuses on a survival model from the Titanic disaster and provides a tutorial explaining how decision trees tend to over-fit models, yielding anomalous predictions. How do we build a better predictive model? The answer, as Trevor observes, is to grow a whole forest of decision trees, let the models grow as deep as they will, and let these ...
More

Parallel R with BatchJobs

Parallelizing R with BatchJobs - an example using k-means. Gord Sissons, Feng Li. Many simulations in R are long-running. Analysis of statistical algorithms can generate workloads that run for hours, if not days, tying up a single computer. Given the amount of time R programmers can spend waiting for results, getting acquainted with parallelism makes sense. In this first in a series of blogs, we describe an approach to achieving parallelism in R using BatchJobs, a framework that provides Map, Re...
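The chunking approach can be pictured from the shell. The post itself drives this from R via BatchJobs; the script name `kmeans_chunk.R` and the chunk count here are hypothetical stand-ins for what the framework generates:

```shell
# Illustrative only: submit each chunk of the split workload as an
# independent serial job; the scheduler runs them in parallel across
# the cluster as slots become free.
for chunk in 1 2 3 4; do
  bsub Rscript kmeans_chunk.R "$chunk"
done
```

BatchJobs automates exactly this loop, plus collecting and reducing the per-chunk results, so the R user never writes submission commands by hand.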
More

Exploring Fairshare Scheduling in OpenLava

Sharing is good. Whether we're sharing a soda, an apartment or an HPC cluster, chances are good that sharing can save us money. As readers of my previous blog will know, I've been doing some experimenting with OpenLava. OpenLava is an open source workload manager that is free to use and downloadable from http://openlava.org or http://teraproc.com. One of the new features in OpenLava 3.0 is fairshare scheduling. I know a lot of clients see value in this, so I decided to set up another free clus...
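A hypothetical configuration fragment showing how fairshare between two groups might be enabled (syntax modeled on LSF-style queue configuration, which OpenLava inherits; the group names and share values are illustrative, not from the post):

```
# lsb.queues (hypothetical fragment)
Begin Queue
QUEUE_NAME = normal
FAIRSHARE  = USER_SHARES[[teamA, 70] [teamB, 30]]
End Queue
```

With shares assigned, the scheduler computes a dynamic priority per group from its share and recent usage, so a group that has been consuming more than its share is dispatched behind one that has been under-served.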
More

Early access for R CaaS

Teraproc announces early registration for our R Cluster-as-a-Service offering. It's the eleventh hour, so hurry up and secure your space! Learn more about the service here. As data scientists and statisticians know, R is an excellent language for analytic problems. For large-scale problems, configuring distributed Hadoop or compute clusters can be a challenge. Talented technical people can spend days or weeks building out distributed clusters, assembling all the needed software components a...
More