C++ Templates for Scientific Programming_Free Document Search and Download

Page 1

C++ Templates for Scientific Programming
January 26, 2010

Page 2

About C++
▶ C++ is a general purpose language (unlike Fortran, Matlab). ▶ Runs anywhere, for all sorts of programs (unlike ..) ▶ Compiled and strongly typed (like Fortran; unlike Matlab). ▶ Can be called from other languages. ▶ Needs a bit more effort than special purpose languages.

Page 3

What is strong typing?
The stupid program
double average(double x , double y) { return (x+y )/2; }
is compiled to efficient machine instructions
pushl %ebp movl %esp , %ebp f l d l 16(%ebp) f a d d l 8(%ebp) popl %ebp fmuls .LC0 ret .LC0 : .long 1056964608

Page 4

But programs may also have a mathematical meaning double average(double x , double y) { return (x+y)/2; } Machine ↓ Mathematician ↓
pushl %ebp movl %esp , %ebp f l d l 16(%ebp) f a d d l 8(%ebp) popl %ebp fmuls .LC0 ret .LC0 : .long 1056964608
x+y 2

Page 5

Can we generalize like a mathematician?
int average( int x , int y) { return (x+y)/2; } double average(double x , double y) { return (x+y)/2; } complex average(complex x , complex y) { return (x+y)/2; } Same idea, different machine instructions.

Page 6

Templates
In C++ we have templates for generalizing: template<typename T> T average(T x, T y) { return (x+y)/2; } The compiler will try to figure out what machine code to write a = average(1, 2); a = average (1.2 , 3.4); a = average(3, 5.5); // Error , ambigous ! a = average<double >(3, 5.5); // OK a = average<int >(3, 5.5); // Also OK

Page 7

We can use old code for new tricks: matrix m1, m2, m3; m3 = average(m1, m2); // Average two matrices We can average anything with a + and /2 defined! electron_density d1, d2, da; da = average(d1, d2); // Average density With templates we can make C++ more mathematical, but still get good performance.

Page 8

With templates you can run in multiple precision1 without changing your code. Specialize! my_algorithm<float >(); // Single precision my_algorithm<double >(); //Double precision my_algorithm<qd_real >(); // Quad Double precision Automatic Differentiation (if we only had a polynomial type..): my_algorithm< polynomial<double> >(); The compiler will “cut and paste” the type (double etc.) into the code.
1httpXGG™rdFl˜lFgovG£dh˜—ileyGmpdistG

Page 9

What are the downsides?
▶ “Long” compilation times (compared to?) ▶ Easy to generate a lot (MB’s) of code ▶ Have to beware type conflicts and conversions ▶ Horrible error messages from the compiler ▶ Need some thought to design (but easy to use)

Page 10

Template classes
We can also create general data structures: template<typename T> class matrix { int rows , columns ; T ∗data ; ... }; We use it like this: matrix<double> M; matrix<complex> Mc; M. invert (); det = Mc. determinant ();

Page 11

Tensoring
And why not a block matrix? matrix< matrix<double> > M; M = M11 M12 M21 M22 Mathematical concept of Tensor product!
n×m
⊗
k×l
Will M. invert (); work?

Page 12

Recursive templates
Binomial coefficients: → k 1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 ↓ n Recursive definition: n k = n − 1 k − 1 + n − 1 k

Page 13

Recursive functions
Let’s program n k = n − 1 k − 1 + n − 1 k Recursive solution: int binomial( int n , int k) { i f (k == 0 or k == n) return 1; else return binomial(n−1,k−1) + binomial(n−1,k); } Works, but super slow (2n function calls).

Page 14

Recursive template
Template solution: calculate coefficients at compile time. template <int n , int k> struct binomial { enum { value = binomial<n−1,k−1>::value + binomial<n−1,k>::value }; }; template <int n> struct binomial<n,n> // Terminate recursion { enum { value = 1 }; };

Page 15

Recursive template
Use the binomial coefficient template like this: cout << "binomial(10,9)= " << binomial <10,9>::value ; Super fast – just printing a constant instead of 1000 function calls. We cannot do for ( int i =0;i <3; i++) cout << " binomial (3 , " << i << ")=" << binomial <3,i >::value; Why not? No specific loop construct for templates, have to use recursive definitions. ¨

Page 16

A useful template class
Let’s end with a real example: “Cubic” polynomials We have 2N terms, so N probably smaller than ∼ 20.

Page 17

Cube template class
An N-cube consists of two cubes of lower dimension, template< typename T, int N > struct cube { cube< T, N−1 > lower [2]; }; except for the 0-cube, which has only one coefficient. A template specialization! template<typename T> struct cube<T,0> { T data ; };

Page 18

Arithmetics
We can add cube polynomials of the same type template< typename T, int N > cube<T,N> operator+(cube<T,N> c1 , cube<T,N> c2) { cube<T,N> res ; res . lower [0] = c1 . lower [0] + c2 . lower [0]; res . lower [1] = c1 . lower [1] + c2 . lower [1]; return res ; }

Page 19

Arithmetics
And multiply with truncation (spot the recursion?): (a + bx)(c + dx) = ac + (ad + bc)x + (x2) cube<T,N> operator ∗(cube<T,N> c1 , cube<T,N> c2) { cube<T,N> res ; res . lower [0] = c1 . lower [0]∗c2 . lower [0]; res . lower [1] = c1 . lower [0]∗c2 . lower [1] + c1 . lower [1]∗c2 . lower [0]; return res ; }

Page 20

Intrinsics
And a handful of intrinsics: f (a + bx) = f (a) + f (a)bx + (x2) cube<T,N> exp(cube<T,N> c) { cube<T,N> res ; res . lower [0] = exp(c.lower [0]); res . lower [1] = exp(c.lower [0])∗ c . lower [1]; return res ; } Other functions only slightly more complicated. Also have to take care of N = 0 case. Add const &, assignment operators etc, and..

Page 21

Super fast special purpose automatic differentiation
template<typename T> T f(T x, T y) { return x∗log(x) + sqrt(x∗y); } . . cube<double,3> x,y,z; . . // Initialize x and y z = f(x,y); // Taylor expand f The “cube” form is encountered in some problems, but is inefficient in general.

Page 22

Performance of the cube class
Multiplication performance compared to standard recursive implementation:
0.0001 0.001 0.01 0.1 1 10 100 3 4 5 6 7 8 9 tim e/m ultiply N template dynamic
Templates can be efficiently optimized for small sizes. Hybrid approaches are possible. General Taylor 50x faster with templates.

Page 23

Software
Don’t lock up useful code in complicated programs!
1. Fast (small) multivariate polynomial multiplication
http://polymul.googlecode.com
2. Taylor expansions and automatic differentiation
http://libtaylor.googlecode.com

Page 24

A general Taylor series library
I have developed a general (order and number of variables) Taylor expansion library.
▶ High order automatic differentiation. ▶ C++ template implementation. ▶ High performance (but not “fast”) multiplications. ▶ Arbitrary coefficient type (high precision possible). ▶ Standard math functions implemented. ▶ Efficient change of variables (example: rotate multipoles) ▶ GPU implementation coming.
Code is freely available. Suitable for high order problems with a small number of active variables.

Page 25

Taylor library example
template<typename T> T dist(T x, T y) { return sqrt(x∗x + y∗y); } void expand_it() { // Set up two infinitesimal variables taylor<double , 2, 3> dx(0 ,0) , dy(0 ,1); cout << " Expansion : " << dist(2+dx,3+dy) << endl ; }
Works reasonably up to ∼ 1000 coefficients.

Page 26

Taylor class
Problem specific structures from tensoring:
taylor< taylor<double, 5,N>, 3,1> x;

Page 27

Getting started with C++ templates
▶ Not covered in this talk: Expression templates ▶ Look at existing libraries, but make sure they are up to date! ▶ Make sure your compiler is up to date (i.e. GCC) ▶ Get a good modern C++ reference and start programming
Why templates?
▶ Brings back flexibility in a compiled language ▶ Reusable concepts ▶ High performance

C++ Templates for Scientific Programming

Recent Documents:

Recent Search: