For convex optimization problems, deterministic first-order methods have linear convergence provided that the objective function is smooth (i.e., has a Lipschitz continuous gradient) and strongly convex. Moreover, under the same conditions – smoothness and strong convexity – stochastic first-order methods have sublinear convergence rates. However, in many applications (machine learning, statistics, control, signal processing) the smoothness/strong convexity conditions do not hold, yet the objective function still has special structure (e.g., the composition of a strongly convex function with a linear map). In this talk we replace the smoothness/strong convexity assumptions with several less conservative conditions under which we prove that several (stochastic) first-order methods converge linearly. We also provide necessary conditions for linear convergence of the (stochastic) gradient method. Finally, we provide examples of several functional classes satisfying our new conditions and discuss several applications of these results (the Lasso problem, linear systems, linear programming, convex feasibility, etc.).
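As a small illustration of the classical baseline the talk starts from (not the talk's new conditions), the sketch below runs gradient descent on a smooth, strongly convex quadratic and checks that successive error ratios settle below one, i.e., the iterates contract linearly. All names and constants here are illustrative.

```python
import numpy as np

# Gradient descent on f(x) = 0.5 * x^T A x with A positive definite
# (smooth with Lipschitz gradient, and strongly convex).
rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
A = Q @ Q.T + np.eye(5)           # eigenvalues >= 1, so strongly convex
L = np.linalg.eigvalsh(A).max()   # Lipschitz constant of the gradient

x = rng.standard_normal(5)
errs = []
for _ in range(50):
    x = x - (1.0 / L) * (A @ x)   # gradient step; the minimizer is x* = 0
    errs.append(np.linalg.norm(x))

# Ratios of successive errors settle near a constant strictly below 1,
# which is exactly what "linear convergence" means.
ratios = [errs[k + 1] / errs[k] for k in range(40, 49)]
```

Without strong convexity this contraction can fail for plain gradient descent, which is the gap the conditions in the talk are designed to fill.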
This technical talk will show live calculations in Mathematica 11 and other Wolfram technologies relevant to courses and research. Specific topics include:
* Visualize data, functions, surfaces, and more in 2D or 3D
* Store and share documents locally or in the Wolfram Cloud
* Use the Predictive Interface to get suggestions for the next useful calculation or function options
* Access trillions of bits of on-demand data
* Easily turn static examples into mouse-driven, dynamic applications
* Get deep support for specialized areas including machine learning, time series, image processing, parallelization, control systems, and 3D printing, with no add-ons required
* Compute in the Wolfram Cloud
Current users will benefit from seeing the many improvements and new features of Mathematica 11 (https://www.wolfram.com/mathematica/new-in-11/), but prior knowledge of Mathematica is not required.
Empirical risk minimization (ERM) problems express optimal classifiers as solutions of optimization problems in which the objective is the sum of a very large number of sample costs. Established approaches to solving ERM rely on computing stochastic gradient directions by accessing a single summand at each iteration. Despite the efficiency of individual iterations, these methods can be slow to converge and have convergence rates that are linear at best. In this talk we discuss approaches to adapting Newton and quasi-Newton methods to ERM problems. In the incremental quasi-Newton method we exploit memory to store curvature approximation matrices. We show that these curvature approximations succeed in approximating the Hessian and thereby lead to superlinear convergence. In the Adaptive Newton method we consider subsets of training samples that are augmented geometrically by a factor of two. Each time the training set is augmented we perform a single Newton step. We show that it is possible to achieve statistical accuracy with just two passes over the dataset.
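The sample-doubling idea above can be sketched as follows. This is an illustrative toy, not the authors' exact method: an L2-regularized logistic regression is fit by taking one warm-started Newton step each time the training subset doubles, so the total data touched is less than two passes (64 + 128 + ... + 4096 < 2n). All data and constants are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 4096, 10, 0.1
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = (rng.random(n) < 1 / (1 + np.exp(-X @ w_true))).astype(float)

def newton_step(w, Xs, ys):
    """One Newton step on the regularized logistic ERM objective."""
    p = 1 / (1 + np.exp(-Xs @ w))
    grad = Xs.T @ (p - ys) / len(ys) + lam * w
    H = (Xs * (p * (1 - p))[:, None]).T @ Xs / len(ys) + lam * np.eye(d)
    return w - np.linalg.solve(H, grad)

w = np.zeros(d)
m = 64
while m <= n:                     # training subset doubles each stage
    w = newton_step(w, X[:m], y[:m])   # a single Newton step per stage
    m *= 2
```

The warm start is the point: the solution on m samples is statistically close to the solution on 2m samples, so one Newton step suffices to track it.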
We present a model for clustering which combines two criteria: given a collection of objects with a pairwise similarity measure, the problem is to find a cluster that is as dissimilar as possible from its complement, while having as much similarity as possible within the cluster. The two objectives are combined either as a ratio or with linear weights. The ratio problem, and its linearly weighted version, are solved by a combinatorial algorithm within the complexity of a single minimum s,t-cut algorithm. This problem (HNC) is closely related to the NP-hard problem of normalized cut that is often used in image segmentation and for which heuristic solutions are generated with the eigenvector technique (spectral method).
It is demonstrated that HNC is a form of relaxation of normalized cut and its generalization to "q-normalized cut". We study the relationship between this HNC relaxation and the spectral method and demonstrate a number of advantages for the combinatorial algorithm. These advantages include a better approximation, in practice, of the normalized cut objective for image segmentation benchmark problems.
HNC can be utilized as a supervised or unsupervised machine learning technique. It has been used for data mining, and its comparison to leading machine learning techniques on datasets selected from the UCI data mining benchmark and other benchmarks indicates that its use of pairwise comparisons is powerful in improving performance accuracy. HNC is currently the leading neuron segmentation algorithm in the Neurofinder benchmark for cell identification in calcium imaging movies. Time permitting, we will discuss our recently developed methods for making the family of HNC algorithms scalable to massive-scale datasets.
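To make the ratio objective concrete, the toy below evaluates it by brute-force enumeration on a five-node similarity matrix (all numbers invented for illustration). The point of the talk is that the same optimum is found with a single minimum s,t-cut computation rather than enumeration; this sketch only shows what is being optimized.

```python
import itertools
import numpy as np

# Pairwise similarities: nodes {0, 1, 2} are tightly linked to each other
# and only weakly tied to {3, 4}.
W = np.array([[0, 9, 8, 1, 0],
              [9, 0, 7, 0, 1],
              [8, 7, 0, 1, 0],
              [1, 0, 1, 0, 9],
              [0, 1, 0, 9, 0]], dtype=float)
nodes = range(5)

def ratio(S):
    """Similarity cut to the complement, divided by within-cluster similarity."""
    Sc = [v for v in nodes if v not in S]
    cut = sum(W[i, j] for i in S for j in Sc)
    within = sum(W[i, j] for i in S for j in S if i < j)
    return cut / within if within > 0 else float("inf")

# Enumerate proper subsets of size >= 2 (singletons have no within-similarity).
best = min((frozenset(S) for r in range(2, 5)
            for S in itertools.combinations(nodes, r)), key=ratio)
# The minimizer is the tightly linked group {0, 1, 2}, with ratio 3/24.
```

Enumeration is exponential in the number of objects, which is exactly why the polynomial min-cut formulation matters at scale.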
Please join the ISE Department at the annual ISE Banquet! This year's cocktail reception & banquet will be held in the ASA Packer Dining Room, University Center. The cocktail reception will begin at 5:30 pm, followed by the banquet at 6:30 pm!
If you are interested in attending, please register here: https://www.eventville.com/catalog/eventregistration1.asp?eventid=1012138.
MOSEK is a software package for solving large-scale sparse optimization problems. To be precise, MOSEK can solve linear, convex quadratic, and conic quadratic optimization problems, possibly with some integer-constrained variables. In addition, MOSEK can solve continuous semidefinite optimization problems.
In this presentation we will review what is new and improved in the recently released version 8. We will also present computational results documenting that upgrading to MOSEK version 8 provides a genuine enhancement in numerical stability and performance.
The results also document that MOSEK is among the best, if not the best, semidefinite optimizers.
Product and content personalization is now ubiquitous in e-commerce. Available transactional data is typically too sparse for this task. As such, companies today seek to use a variety of information on the interactions between a product and a customer to drive personalization decisions. We formalize this problem as one of recovering a large-scale matrix, with side information in the form of additional matrices of conforming dimension. Viewing the matrix we seek to recover and the side information we have as slices of a tensor, we consider the problem of Slice Recovery, which is to recover specific slices of ‘simple’ tensors from noisy observations of the entire tensor. We propose a definition of simplicity that on the one hand elegantly generalizes a standard generative model for our motivating problem, and on the other subsumes low-rank tensors for a variety of existing definitions of tensor rank. We provide an efficient algorithm for slice recovery that is practical for massive datasets and provides a significant performance improvement over state of the art incumbent approaches to tensor recovery. Further, we establish near-optimal recovery guarantees that in an important regime represent an order improvement over the best available results for this problem. Experiments on data from a music streaming service demonstrate the performance and scalability of our algorithm.
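As a loose, self-contained analogue of the recovery setup above (not the talk's algorithm), the sketch below observes a low-rank matrix slice through additive noise and denoises it by truncated SVD, the basic primitive behind many low-rank recovery methods. Dimensions, rank, and noise level are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 200, 150, 3
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-3 signal
Y = M + 0.1 * rng.standard_normal((m, n))                      # noisy observation

# Keep only the top-r singular directions of the noisy slice.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]

raw_err = np.linalg.norm(Y - M) / np.linalg.norm(M)
rec_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
# Truncation strips the noise outside the rank-r subspace, so rec_err < raw_err.
```

Slice recovery with side information goes well beyond this per-matrix truncation, but the low-rank projection step conveys why 'simple' (low-rank-like) structure makes recovery from noisy observations possible.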
Lehigh’s Industrial and Systems Engineering (ISE) Council will be holding its seventh ISE Career Fair on September 14, 2016, a day before the Lehigh University Career Fair on September 15th. Employers and students will be able to meet in a personal setting and discuss the company’s internship/co-op/job opportunities! *This event is for both ISE & HSE (Healthcare Systems Engineering) students. All companies attending will receive a resume book. Registration fees are per company: Sponsor - $130 (sponsors of the networking event will get a 5-minute presentation during the first half hour of the event; sponsorship is limited!); Regular - $80; Government - $50; Non-Profit - $30 (must send a copy of 501(c)(3)).