Reference documentation for deal.II version GIT 46e385c35d 20221129 03:20:01+00:00

This tutorial depends on step2.
Table of contents  

This is the first example where we actually use finite elements to compute something. We will solve a simple version of Poisson's equation with zero boundary values, but a nonzero right hand side:
\begin{align*} \Delta u &= f \qquad\qquad & \text{in}\ \Omega, \\ u &= 0 \qquad\qquad & \text{on}\ \partial\Omega. \end{align*}
We will solve this equation on the square, \(\Omega=[1,1]^2\), for which you've already learned how to generate a mesh in step1 and step2. In this program, we will also only consider the particular case \(f(\mathbf x)=1\) and come back to how to implement the more general case in the next tutorial program, step4.
If you've learned about the basics of the finite element method, you will remember the steps we need to take to approximate the solution \(u\) by a finite dimensional approximation. Specifically, we first need to derive the weak form of the equation above, which we obtain by multiplying the equation by a test function \(\varphi\) from the left (we will come back to the reason for multiplying from the left and not from the right below) and integrating over the domain \(\Omega\):
\begin{align*} \int_\Omega \varphi \Delta u = \int_\Omega \varphi f. \end{align*}
This can be integrated by parts:
\begin{align*} \int_\Omega \nabla\varphi \cdot \nabla u  \int_{\partial\Omega} \varphi \mathbf{n}\cdot \nabla u = \int_\Omega \varphi f. \end{align*}
The test function \(\varphi\) has to satisfy the same kind of boundary conditions (in mathematical terms: it needs to come from the tangent space of the set in which we seek the solution), so on the boundary \(\varphi=0\) and consequently the weak form we are looking for reads
\begin{align*} (\nabla\varphi, \nabla u) = (\varphi, f), \end{align*}
where we have used the common notation \((a,b)=\int_\Omega a\; b\). The problem then asks for a function \(u\) for which this statement is true for all test functions \(\varphi\) from the appropriate space (which here is the space \(H^1\)).
Of course we can't find such a function on a computer in the general case, and instead we seek an approximation \(u_h(\mathbf x)=\sum_j U_j \varphi_j(\mathbf x)\), where the \(U_j\) are unknown expansion coefficients we need to determine (the "degrees of freedom" of this problem), and \(\varphi_i(\mathbf x)\) are the finite element shape functions we will use. To define these shape functions, we need the following:
Through these steps, we now have a set of functions \(\varphi_i\), and we can define the weak form of the discrete problem: Find a function \(u_h\), i.e., find the expansion coefficients \(U_j\) mentioned above, so that
\begin{align*} (\nabla\varphi_i, \nabla u_h) = (\varphi_i, f), \qquad\qquad i=0\ldots N1. \end{align*}
Note that we here follow the convention that everything is counted starting at zero, as common in C and C++. This equation can be rewritten as a linear system if you insert the representation \(u_h(\mathbf x)=\sum_j U_j \varphi_j(\mathbf x)\) and then observe that
\begin{align*} (\nabla\varphi_i, \nabla u_h) &= \left(\nabla\varphi_i, \nabla \Bigl[\sum_j U_j \varphi_j\Bigr]\right) \\ &= \sum_j \left(\nabla\varphi_i, \nabla \left[U_j \varphi_j\right]\right) \\ &= \sum_j \left(\nabla\varphi_i, \nabla \varphi_j \right) U_j. \end{align*}
With this, the problem reads: Find a vector \(U\) so that
\begin{align*} A U = F, \end{align*}
where the matrix \(A\) and the right hand side \(F\) are defined as
\begin{align*} A_{ij} &= (\nabla\varphi_i, \nabla \varphi_j), \\ F_i &= (\varphi_i, f). \end{align*}
Before we move on with describing how these quantities can be computed, note that if we had multiplied the original equation from the right by a test function rather than from the left, then we would have obtained a linear system of the form
\begin{align*} U^T A = F^T \end{align*}
with a row vector \(F^T\). By transposing this system, this is of course equivalent to solving
\begin{align*} A^T U = F \end{align*}
which here is the same as above since \(A=A^T\). But in general is not, and in order to avoid any sort of confusion, experience has shown that simply getting into the habit of multiplying the equation from the left rather than from the right (as is often done in the mathematical literature) avoids a common class of errors as the matrix is automatically correct and does not need to be transposed when comparing theory and implementation. See step9 for the first example in this tutorial where we have a nonsymmetric bilinear form for which it makes a difference whether we multiply from the right or from the left.
Now we know what we need (namely: objects that hold the matrix and vectors, as well as ways to compute \(A_{ij},F_i\)), and we can look at what it takes to make that happen:
\begin{align*} A_{ij} &= (\nabla\varphi_i, \nabla \varphi_j) = \sum_{K \in {\mathbb T}} \int_K \nabla\varphi_i \cdot \nabla \varphi_j, \\ F_i &= (\varphi_i, f) = \sum_{K \in {\mathbb T}} \int_K \varphi_i f, \end{align*}
and then approximate each cell's contribution by quadrature:\begin{align*} A^K_{ij} &= \int_K \nabla\varphi_i \cdot \nabla \varphi_j \approx \sum_q \nabla\varphi_i(\mathbf x^K_q) \cdot \nabla \varphi_j(\mathbf x^K_q) w_q^K, \\ F^K_i &= \int_K \varphi_i f \approx \sum_q \varphi_i(\mathbf x^K_q) f(\mathbf x^K_q) w^K_q, \end{align*}
where \(\mathbb{T} \approx \Omega\) is a Triangulation approximating the domain, \(\mathbf x^K_q\) is the \(q\)th quadrature point on cell \(K\), and \(w^K_q\) the \(q\)th quadrature weight. There are different parts to what is needed in doing this, and we will discuss them in turn next.The process of computing the matrix and right hand side as a sum over all cells (and then a sum over quadrature points) is usually called assembling the linear system, or assembly for short, using the meaning of the word related to assembly line, meaning "the act of putting together a set of pieces, fragments, or elements".
FEValues really is the central class in the assembly process. One way you can view it is as follows: The FiniteElement and derived classes describe shape functions, i.e., infinite dimensional objects: functions have values at every point. We need this for theoretical reasons because we want to perform our analysis with integrals over functions. However, for a computer, this is a very difficult concept, since they can in general only deal with a finite amount of information, and so we replace integrals by sums over quadrature points that we obtain by mapping (the Mapping object) using points defined on a reference cell (the Quadrature object) onto points on the real cell. In essence, we reduce the problem to one where we only need a finite amount of information, namely shape function values and derivatives, quadrature weights, normal vectors, etc, exclusively at a finite set of points. The FEValues class is the one that brings the three components together and provides this finite set of information on a particular cell \(K\). You will see it in action when we assemble the linear system below.
It is noteworthy that all of this could also be achieved if you simply created these three objects yourself in an application program, and juggled the information yourself. However, this would neither be simpler (the FEValues class provides exactly the kind of information you actually need) nor faster: the FEValues class is highly optimized to only compute on each cell the particular information you need; if anything can be reused from the previous cell, then it will do so, and there is a lot of code in that class to make sure things are cached wherever this is advantageous.
The final piece of this introduction is to mention that after a linear system is obtained, it is solved using an iterative solver and then postprocessed: we create an output file using the DataOut class that can then be visualized using one of the common visualization programs.
For a finite element program, the linear system we end up with here is relatively small: The matrix has size \(1089 \times 1089\), owing to the fact that the mesh we use is \(32\times 32\) and so there are \(33^2=1089\) vertices in the mesh. In many of the later tutorial programs, matrix sizes in the range of tens of thousands to hundreds of thousands will not be uncommon, and with codes such as ASPECT that build on deal.II, we regularly solve problems with more than a hundred million equations (albeit using parallel computers). In any case, even for the small system here, the matrix is much larger than what one typically encounters in an undergraduate or most graduate courses, and so the question arises how we can solve such linear systems.
The first method one typically learns for solving linear systems is Gaussian elimination. The problem with this method is that it requires a number of operations that is proportional to \(N^3\), where \(N\) is the number of equations or unknowns in the linear system – more specifically, the number of operations is \(\frac 23 N^3\), give or take a few. With \(N=1089\), this means that we would have to do around \(861\) million operations. This is a number that is quite feasible and it would take modern processors less than 0.1 seconds to do this. But it is clear that this isn't going to scale: If we have twenty times as many equations in the linear system (that is, twenty times as many unknowns), then it would already take 100010,000 seconds or on the order of an hour. Make the linear system another ten times larger, and it is clear that we can not solve it any more on a single computer.
One can rescue the situation somewhat by realizing that only a relatively small number of entries in the matrix are nonzero – that is, the matrix is sparse. Variations of Gaussian elimination can exploit this, making the process substantially faster; we will use one such method – implemented in the SparseDirectUMFPACK class – in step29 for the first time, among several others than come after that. These variations of Gaussian elimination might get us to problem sizes on the order of 100,000 or 200,000, but not all that much beyond that.
Instead, what we will do here is take up an idea from 1952: the Conjugate Gradient method, or in short "CG". CG is an "iterative" solver in that it forms a sequence of vectors that converge to the exact solution; in fact, after \(N\) such iterations in the absence of roundoff errors it finds the exact solution if the matrix is symmetric and positive definite. The method was originally developed as another way to solve a linear system exactly, like Gaussian elimination, but as such it had few advantages and was largely forgotten for a few decades. But, when computers became powerful enough to solve problems of a size where Gaussian elimination doesn't work well any more (sometime in the 1980s), CG was rediscovered as people realized that it is well suited for large and sparse systems like the ones we get from the finite element method. This is because (i) the vectors it computes converge to the exact solution, and consequently we do not actually have to do all \(N\) iterations to find the exact solution as long as we're happy with reasonably good approximations; and (ii) it only ever requires matrixvector products, which is very useful for sparse matrices because a sparse matrix has, by definition, only \({\cal O}(N)\) entries and so a matrixvector product can be done with \({\cal O}(N)\) effort whereas it costs \(N^2\) operations to do the same for dense matrices. As a consequence, we can hope to solve linear systems with at most \({\cal O}(N^2)\) operations, and in many cases substantially fewer.
Finite element codes therefore almost always use iterative solvers such as CG for the solution of the linear systems, and we will do so in this code as well. (We note that the CG method is only usable for matrices that are symmetric and positive definite; for other equations, the matrix may not have these properties and we will have to use other variations of iterative solvers such as BiCGStab or GMRES that are applicable to more general matrices.)
An important component of these iterative solvers is that we specify the tolerance with which we want to solve the linear system – in essence, a statement about the error we are willing to accept in our approximate solution. The error in an approximate solution \(\tilde x\) obtained to the exact solution \(x\) of a linear system \(Ax=b\) is defined as \(\x\tilde x\\), but this is a quantity we cannot compute because we don't know the exact solution \(x\). Instead, we typically consider the residual, defined as \(\bA\tilde x\=\A(x\tilde x)\\), as a computable measure. We then let the iterative solver compute more and more accurate solutions \(\tilde x\), until \(\bA\tilde x\\le \tau\). A practical question is what value \(\tau\) should have. In most applications, setting
\begin{align*} \tau = 10^{6} \b\ \end{align*}
is a reasonable choice. The fact that we make \(\tau\) proportional to the size (norm) of \(b\) makes sure that our expectations of the accuracy in the solution are relative to the size of the solution. This makes sense: If we make the right hand side \(b\) ten times larger, then the solution \(x\) of \(Ax=b\) will also be ten times larger, and so will \(\tilde x\); we want the same number of accurate digits in \(\tilde x\) as before, which means that we should also terminate when the residual \(\bA\tilde x\\) is ten times the original size – which is exactly what we get if we make \(\tau\) proportional to \(\b\\).
All of this will be implemented in the Step3::solve()
function in this program. As you will see, it is quite simple to set up linear solvers with deal.II: The whole function will have only three lines.
Although this is the simplest possible equation you can solve using the finite element method, this program shows the basic structure of most finite element programs and also serves as the template that almost all of the following programs will essentially follow. Specifically, the main class of this program looks like this:
This follows the object oriented programming mantra of data encapsulation, i.e. we do our best to hide almost all internal details of this class in private members that are not accessible to the outside.
Let's start with the member variables: These follow the building blocks we have outlined above in the bullet points, namely we need a Triangulation and a DoFHandler object, and a finite element object that describes the kinds of shape functions we want to use. The second group of objects relate to the linear algebra: the system matrix and right hand side as well as the solution vector, and an object that describes the sparsity pattern of the matrix. This is all this class needs (and the essentials that any solver for a stationary PDE requires) and that needs to survive throughout the entire program. In contrast to this, the FEValues object we need for assembly is only required throughout assembly, and so we create it as a local object in the function that does that and destroy it again at its end.
Secondly, let's look at the member functions. These, as well, already form the common structure that almost all following tutorial programs will use:
make_grid()
: This is what one could call a preprocessing function. As its name suggests, it sets up the object that stores the triangulation. In later examples, it could also deal with boundary conditions, geometries, etc. setup_system()
: This then is the function in which all the other data structures are set up that are needed to solve the problem. In particular, it will initialize the DoFHandler object and correctly size the various objects that have to do with the linear algebra. This function is often separated from the preprocessing function above because, in a time dependent program, it may be called at least every few time steps whenever the mesh is adaptively refined (something we will see how to do in step6). On the other hand, setting up the mesh itself in the preprocessing function above is done only once at the beginning of the program and is, therefore, separated into its own function. assemble_system()
: This, then is where the contents of the matrix and right hand side are computed, as discussed at length in the introduction above. Since doing something with this linear system is conceptually very different from computing its entries, we separate it from the following function. solve()
: This then is the function in which we compute the solution \(U\) of the linear system \(AU=F\). In the current program, this is a simple task since the matrix is so simple, but it will become a significant part of a program's size whenever the problem is not so trivial any more (see, for example, step20, step22, or step31 once you've learned a bit more about the library). output_results()
: Finally, when you have computed a solution, you probably want to do something with it. For example, you may want to output it in a format that can be visualized, or you may want to compute quantities you are interested in: say, heat fluxes in a heat exchanger, air friction coefficients of a wing, maximum bridge loads, or simply the value of the numerical solution at a point. This function is therefore the place for postprocessing your solution. All of this is held together by the single public function (other than the constructor), namely the run()
function. It is the one that is called from the place where an object of this type is created, and it is the one that calls all the other functions in their proper order. Encapsulating this operation into the run()
function, rather than calling all the other functions from main()
makes sure that you can change how the separation of concerns within this class is implemented. For example, if one of the functions becomes too big, you can split it up into two, and the only places you have to be concerned about changing as a consequence are within this very same class, and not anywhere else.
As mentioned above, you will see this general structure — sometimes with variants in spelling of the functions' names, but in essentially this order of separation of functionality — again in many of the following tutorial programs.
deal.II defines a number of integral types via alias in namespace types. (In the previous sentence, the word "integral" is used as the adjective that corresponds to the noun "integer". It shouldn't be confused with the noun "integral" that represents the area or volume under a curve or surface. The adjective "integral" is widely used in the C++ world in contexts such as "integral type", "integral constant", etc.) In particular, in this program you will see types::global_dof_index in a couple of places: an integer type that is used to denote the global index of a degree of freedom, i.e., the index of a particular degree of freedom within the DoFHandler object that is defined on top of a triangulation (as opposed to the index of a particular degree of freedom within a particular cell). For the current program (as well as almost all of the tutorial programs), you will have a few thousand to maybe a few million unknowns globally (and, for \(Q_1\) elements, you will have 4 locally on each cell in 2d and 8 in 3d). Consequently, a data type that allows to store sufficiently large numbers for global DoF indices is unsigned int
given that it allows to store numbers between 0 and slightly more than 4 billion (on most systems, where integers are 32bit). In fact, this is what types::global_dof_index is.
So, why not just use unsigned int
right away? deal.II used to do this until version 7.3. However, deal.II supports very large computations (via the framework discussed in step40) that may have more than 4 billion unknowns when spread across a few thousand processors. Consequently, there are situations where unsigned int
is not sufficiently large and we need a 64bit unsigned integral type. To make this possible, we introduced types::global_dof_index which by default is defined as simply unsigned int
whereas it is possible to define it as unsigned long long int
if necessary, by passing a particular flag during configuration (see the ReadMe file).
This covers the technical aspect. But there is also a documentation purpose: everywhere in the library and codes that are built on it, if you see a place using the data type types::global_dof_index, you immediately know that the quantity that is being referenced is, in fact, a global dof index. No such meaning would be apparent if we had just used unsigned int
(which may also be a local index, a boundary indicator, a material id, etc.). Immediately knowing what a variable refers to also helps avoid errors: it's quite clear that there must be a bug if you see an object of type types::global_dof_index being assigned to variable of type types::subdomain_id, even though they are both represented by unsigned integers and the compiler will, consequently, not complain.
In more practical terms what the presence of this type means is that during assembly, we create a \(4\times 4\) matrix (in 2d, using a \(Q_1\) element) of the contributions of the cell we are currently sitting on, and then we need to add the elements of this matrix to the appropriate elements of the global (system) matrix. For this, we need to get at the global indices of the degrees of freedom that are local to the current cell, for which we will always use the following piece of the code:
where local_dof_indices
is declared as
The name of this variable might be a bit of a misnomer – it stands for "the global indices of those degrees of freedom locally defined on the current cell" – but variables that hold this information are universally named this way throughout the library.
unsigned int
corresponds to, say, a material indicator.These include files are already known to you. They declare the classes which handle triangulations and enumeration of degrees of freedom:
And this is the file in which the functions are declared that create grids:
This file contains the description of the Lagrange interpolation finite element:
And this file is needed for the creation of sparsity patterns of sparse matrices, as shown in previous examples:
The next two files are needed for assembling the matrix using quadrature on each cell. The classes declared in them will be explained below:
The following three include files we need for the treatment of boundary values:
We're now almost to the end. The second to last group of include files is for the linear algebra which we employ to solve the system of equations arising from the finite element discretization of the Laplace equation. We will use vectors and full matrices for assembling the system of equations locally on each cell, and transfer the results into a sparse matrix. We will then use a Conjugate Gradient solver to solve the problem, for which we need a preconditioner (in this program, we use the identity preconditioner which does nothing, but we need to include the file anyway):
Finally, this is for output to a file and to the console:
...and this is to import the deal.II namespace into the global scope:
Step3
classInstead of the procedural programming of previous examples, we encapsulate everything into a class for this program. The class consists of functions which each perform certain aspects of a finite element program, a main
function which controls what is done first and what is done next, and a list of member variables.
The public part of the class is rather short: it has a constructor and a function run
that is called from the outside and acts as something like the main
function: it coordinates which operations of this class shall be run in which order. Everything else in the class, i.e. all the functions that actually do anything, are in the private section of the class:
Then there are the member functions that mostly do what their names suggest and whose have been discussed in the introduction already. Since they do not need to be called from outside, they are made private to this class.
And finally we have some member variables. There are variables describing the triangulation and the global numbering of the degrees of freedom (we will specify the exact polynomial degree of the finite element in the constructor of this class)...
...variables for the sparsity pattern and values of the system matrix resulting from the discretization of the Laplace equation...
...and variables which will hold the right hand side and solution vectors.
Here comes the constructor. It does not much more than first to specify that we want bilinear elements (denoted by the parameter to the finite element object, which indicates the polynomial degree), and to associate the dof_handler variable to the triangulation we use. (Note that the triangulation isn't set up with a mesh at all at the present time, but the DoFHandler doesn't care: it only wants to know which triangulation it will be associated with, and it only starts to care about an actual mesh once you try to distribute degree of freedom on the mesh using the distribute_dofs() function.) All the other member variables of the Step3 class have a default constructor which does all we want.
Now, the first thing we've got to do is to generate the triangulation on which we would like to do our computation and number each vertex with a degree of freedom. We have seen these two steps in step1 and step2 before, respectively.
This function does the first part, creating the mesh. We create the grid and refine all cells five times. Since the initial grid (which is the square \([1,1] \times [1,1]\)) consists of only one cell, the final grid has 32 times 32 cells, for a total of 1024.
Unsure that 1024 is the correct number? We can check that by outputting the number of cells using the n_active_cells()
function on the triangulation.
triangulation.n_cells()
instead in the code above, you would consequently get a value of 1365 instead. On the other hand, the number of cells (as opposed to the number of active cells) is not typically of much interest, so there is no good reason to print it.Next we enumerate all the degrees of freedom and set up matrix and vector objects to hold the system data. Enumerating is done by using DoFHandler::distribute_dofs(), as we have seen in the step2 example. Since we use the FE_Q class and have set the polynomial degree to 1 in the constructor, i.e. bilinear elements, this associates one degree of freedom with each vertex. While we're at generating output, let us also take a look at how many degrees of freedom are generated:
There should be one DoF for each vertex. Since we have a 32 times 32 grid, the number of DoFs should be 33 times 33, or 1089.
As we have seen in the previous example, we set up a sparsity pattern by first creating a temporary structure, tagging those entries that might be nonzero, and then copying the data over to the SparsityPattern object that can then be used by the system matrix.
Note that the SparsityPattern object does not hold the values of the matrix, it only stores the places where entries are. The entries themselves are stored in objects of type SparseMatrix, of which our variable system_matrix is one.
The distinction between sparsity pattern and matrix was made to allow several matrices to use the same sparsity pattern. This may not seem relevant here, but when you consider the size which matrices can have, and that it may take some time to build the sparsity pattern, this becomes important in largescale problems if you have to store several matrices in your program.
The last thing to do in this function is to set the sizes of the right hand side vector and the solution vector to the right values:
The next step is to compute the entries of the matrix and right hand side that form the linear system from which we compute the solution. This is the central function of each finite element program and we have discussed the primary steps in the introduction already.
The general approach to assemble matrices and vectors is to loop over all cells, and on each cell compute the contribution of that cell to the global matrix and right hand side by quadrature. The point to realize now is that we need the values of the shape functions at the locations of quadrature points on the real cell. However, both the finite element shape functions as well as the quadrature points are only defined on the reference cell. They are therefore of little help to us, and we will in fact hardly ever query information about finite element shape functions or quadrature points from these objects directly.
Rather, what is required is a way to map this data from the reference cell to the real cell. Classes that can do that are derived from the Mapping class, though one again often does not have to deal with them directly: many functions in the library can take a mapping object as argument, but when it is omitted they simply resort to the standard bilinear Q1 mapping. We will go this route, and not bother with it for the moment (we come back to this in step10, step11, and step12).
So what we now have is a collection of three classes to deal with: finite element, quadrature, and mapping objects. That's too much, so there is one type of class that orchestrates information exchange between these three: the FEValues class. If given one instance of each three of these objects (or two, and an implicit linear mapping), it will be able to provide you with information about values and gradients of shape functions at quadrature points on a real cell.
Using all this, we will assemble the linear system for this problem in the following function:
Ok, let's start: we need a quadrature formula for the evaluation of the integrals on each cell. Let's take a Gauss formula with two quadrature points in each direction, i.e. a total of four points since we are in 2D. This quadrature formula integrates polynomials of degrees up to three exactly (in 1D). It is easy to check that this is sufficient for the present problem:
And we initialize the object which we have briefly talked about above. It needs to be told which finite element we want to use, and the quadrature points and their weights (jointly described by a Quadrature object). As mentioned, we use the implied Q1 mapping, rather than specifying one ourselves explicitly. Finally, we have to tell it what we want it to compute on each cell: we need the values of the shape functions at the quadrature points (for the right hand side \((\varphi_i,f)\)), their gradients (for the matrix entries \((\nabla \varphi_i, \nabla \varphi_j)\)), and also the weights of the quadrature points and the determinants of the Jacobian transformations from the reference cell to the real cells.
This list of what kind of information we actually need is given as a collection of flags as the third argument to the constructor of FEValues. Since these values have to be recomputed, or updated, every time we go to a new cell, all of these flags start with the prefix update_
and then indicate what it actually is that we want updated. The flag to give if we want the values of the shape functions computed is update_values; for the gradients it is update_gradients. The determinants of the Jacobians and the quadrature weights are always used together, so only the products (Jacobians times weights, or short JxW
) are computed; since we need them, we have to list update_JxW_values as well:
The advantage of this approach is that we can specify what kind of information we actually need on each cell. It is easily understandable that this approach can significantly speed up finite element computations, compared to approaches where everything, including second derivatives, normal vectors to cells, etc are computed on each cell, regardless of whether they are needed or not.
update_values  update_gradients  update_JxW_values
is not immediately obvious to anyone not used to programming bit operations in C for years already. First, operator
is the bitwise or operator, i.e., it takes two integer arguments that are interpreted as bit patterns and returns an integer in which every bit is set for which the corresponding bit is set in at least one of the two arguments. For example, consider the operation 910
. In binary, 9=0b1001
(where the prefix 0b
indicates that the number is to be interpreted as a binary number) and 10=0b1010
. Going through each bit and seeing whether it is set in one of the argument, we arrive at 0b10010b1010=0b1011
or, in decimal notation, 910=11
. The second piece of information you need to know is that the various update_*
flags are all integers that have exactly one bit set. For example, assume that update_values=0b00001=1
, update_gradients=0b00010=2
, update_JxW_values=0b10000=16
. Then update_values  update_gradients  update_JxW_values = 0b10011 = 19
. In other words, we obtain a number that encodes a binary mask representing all of the operations you want to happen, where each operation corresponds to exactly one bit in the integer that, if equal to one, means that a particular piece should be updated on each cell and, if it is zero, means that we need not compute it. In other words, even though operator
is the bitwise OR operation, what it really represents is I want this AND that AND the other. Such binary masks are quite common in C programming, but maybe not so in higher level languages like C++, but serve the current purpose quite well.For use further down below, we define a shortcut for a value that will be used very frequently. Namely, an abbreviation for the number of degrees of freedom on each cell (since we are in 2D and degrees of freedom are associated with vertices only, this number is four, but we rather want to write the definition of this variable in a way that does not preclude us from later choosing a different finite element that has a different number of degrees of freedom per cell, or work in a different space dimension).
In general, it is a good idea to use a symbolic name instead of hardcoding these numbers even if you know them, since for example, you may want to change the finite element at some time. Changing the element would have to be done in a different function and it is easy to forget to make a corresponding change in another part of the program. It is better to not rely on your own calculations, but instead ask the right object for the information: Here, we ask the finite element to tell us about the number of degrees of freedom per cell and we will get the correct number regardless of the space dimension or polynomial degree we may have chosen elsewhere in the program.
The shortcut here, defined primarily to discuss the basic concept and not because it saves a lot of typing, will then make the following loops a bit more readable. You will see such shortcuts in many places in larger programs, and dofs_per_cell
is one that is more or less the conventional name for this kind of object.
Now, we said that we wanted to assemble the global matrix and vector cellbycell. We could write the results directly into the global matrix, but this is not very efficient since access to the elements of a sparse matrix is slow. Rather, we first compute the contribution of each cell in a small matrix with the degrees of freedom on the present cell, and only transfer them to the global matrix when the computations are finished for this cell. We do the same for the right hand side vector. So let's first allocate these objects (these being local objects, all degrees of freedom are coupling with all others, and we should use a full matrix object rather than a sparse one for the local operations; everything will be transferred to a global sparse matrix later on):
When assembling the contributions of each cell, we do this with the local numbering of the degrees of freedom (i.e. the number running from zero through dofs_per_cell1). However, when we transfer the result into the global matrix, we have to know the global numbers of the degrees of freedom. When we query them, we need a scratch (temporary) array for these numbers (see the discussion at the end of the introduction for the type, types::global_dof_index, used here):
Now for the loop over all cells. We have seen before how this works for a triangulation. A DoFHandler has cell iterators that are exactly analogous to those of a Triangulation, but with extra information about the degrees of freedom for the finite element you're using. Looping over the active cells of a degreeoffreedom handler works the same as for a triangulation.
Note that we declare the type of the cell as const auto &
instead of auto
this time around. In step 1, we were modifying the cells of the triangulation by flagging them with refinement indicators. Here we're only examining the cells without modifying them, so it's good practice to declare cell
as const
in order to enforce this invariant.
We are now sitting on one cell, and we would like the values and gradients of the shape functions be computed, as well as the determinants of the Jacobian matrices of the mapping between reference cell and true cell, at the quadrature points. Since all these values depend on the geometry of the cell, we have to have the FEValues object recompute them on each cell:
Next, reset the local cell's contributions to global matrix and global right hand side to zero, before we fill them:
Now it is time to start integration over the cell, which we do by looping over all quadrature points, which we will number by q_index.
First assemble the matrix: For the Laplace problem, the matrix on each cell is the integral over the gradients of shape function i and j. Since we do not integrate, but rather use quadrature, this is the sum over all quadrature points of the integrands times the determinant of the Jacobian matrix at the quadrature point times the weight of this quadrature point. You can get the gradient of shape function \(i\) at quadrature point with number q_index by using fe_values.shape_grad(i,q_index)
; this gradient is a 2dimensional vector (in fact it is of type Tensor<1,dim>, with here dim=2) and the product of two such vectors is the scalar product, i.e. the product of the two shape_grad function calls is the dot product. This is in turn multiplied by the Jacobian determinant and the quadrature point weight (that one gets together by the call to FEValues::JxW() ). Finally, this is repeated for all shape functions \(i\) and \(j\):
We then do the same thing for the right hand side. Here, the integral is over the shape function i times the right hand side function, which we choose to be the function with constant value one (more interesting examples will be considered in the following programs).
Now that we have the contribution of this cell, we have to transfer it to the global matrix and right hand side. To this end, we first have to find out which global numbers the degrees of freedom on this cell have. Let's simply ask the cell for that information:
Then again loop over all shape functions i and j and transfer the local elements to the global matrix. The global numbers can be obtained using local_dof_indices[i]:
And again, we do the same thing for the right hand side vector.
Now almost everything is set up for the solution of the discrete system. However, we have not yet taken care of boundary values (in fact, Laplace's equation without Dirichlet boundary values is not even uniquely solvable, since you can add an arbitrary constant to the discrete solution). We therefore have to do something about the situation.
For this, we first obtain a list of the degrees of freedom on the boundary and the value the shape function shall have there. For simplicity, we only interpolate the boundary value function, rather than projecting it onto the boundary. There is a function in the library which does exactly this: VectorTools::interpolate_boundary_values(). Its parameters are (omitting parameters for which default values exist and that we don't care about): the DoFHandler object to get the global numbers of the degrees of freedom on the boundary; the component of the boundary where the boundary values shall be interpolated; the boundary value function itself; and the output object.
The component of the boundary is meant as follows: in many cases, you may want to impose certain boundary values only on parts of the boundary. For example, you may have inflow and outflow boundaries in fluid dynamics, or clamped and free parts of bodies in deformation computations of bodies. Then you will want to denote these different parts of the boundary by indicators, and tell the interpolate_boundary_values function to only compute the boundary values on a certain part of the boundary (e.g. the clamped part, or the inflow boundary). By default, all boundaries have a 0 boundary indicator, unless otherwise specified. If sections of the boundary have different boundary conditions, you have to number those parts with different boundary indicators. The function call below will then only determine boundary values for those parts of the boundary for which the boundary indicator is in fact the zero specified as the second argument.
The function describing the boundary values is an object of type Function or of a derived class. One of the derived classes is Functions::ZeroFunction, which describes (not unexpectedly) a function which is zero everywhere. We create such an object inplace and pass it to the VectorTools::interpolate_boundary_values() function.
Finally, the output object is a list of pairs of global degree of freedom numbers (i.e. the number of the degrees of freedom on the boundary) and their boundary values (which are zero here for all entries). This mapping of DoF numbers to boundary values is done by the std::map
class.
Now that we got the list of boundary DoFs and their respective boundary values, let's use them to modify the system of equations accordingly. This is done by the following function call:
The following function solves the discretized equation. As discussed in the introduction, we want to use an iterative solver to do this, specifically the Conjugate Gradient (CG) method.
The way to do this in deal.II is a threestep process:
At the end of this process, the solution
variable contains the nodal values of the solution function.
The last part of a typical finite element program is to output the results and maybe do some postprocessing (for example compute the maximal stress values at the boundary, or the average flux across the outflow, etc). We have no such postprocessing here, but we would like to write the solution to a file.
To write the output to a file, we need an object which knows about output formats and the like. This is the DataOut class, and we need an object of that type:
Now we have to tell it where to take the values from which it shall write. We tell it which DoFHandler object to use, and the solution vector (and the name by which the solution variable shall appear in the output file). If we had more than one vector which we would like to look at in the output (for example right hand sides, errors per cell, etc) we would add them as well:
After the DataOut object knows which data it is to work on, we have to tell it to process them into something the back ends can handle. The reason is that we have separated the frontend (which knows about how to treat DoFHandler objects and data vectors) from the back end (which knows many different output formats) and use an intermediate data format to transfer data from the front to the backend. The data is transformed into this intermediate format by the following function:
Now we have everything in place for the actual output. Just open a file and write the data into it, using VTK format (there are many other functions in the DataOut class we are using here that can write the data in postscript, AVS, GMV, Gnuplot, or some other file formats):
Finally, the last function of this class is the main function which calls all the other functions of the Step3
class. The order in which this is done resembles the order in which most finite element programs work. Since the names are mostly selfexplanatory, there is not much to comment about:
main
functionThis is the main function of the program. Since the concept of a main function is mostly a remnant from the preobject oriented era before C++ programming, it often does not do much more than creating an object of the toplevel class and calling its principle function.
Finally, the first line of the function is used to enable output of some diagnostics that deal.II can generate. The deallog
variable (which stands for deallog, not deallog) represents a stream to which some parts of the library write output. For example, iterative solvers will generate diagnostics (starting residual, number of solver steps, final residual) as can be seen when running this tutorial program.
The output of deallog
can be written to the console, to a file, or both. Both are disabled by default since over the years we have learned that a program should only generate output when a user explicitly asks for it. But this can be changed, and to explain how this can be done, we need to explain how deallog
works: When individual parts of the library want to log output, they open a "context" or "section" into which this output will be placed. At the end of the part that wants to write output, one exits this section again. Since a function may call another one from within the scope where this output section is open, output may in fact be nested hierarchically into these sections. The LogStream class of which deallog
is a variable calls each of these sections a "prefix" because all output is printed with this prefix at the left end of the line, with prefixes separated by colons. There is always a default prefix called "DEAL" (a hint at deal.II's history as the successor of a previous library called "DEAL" and from which the LogStream class is one of the few pieces of code that were taken into deal.II).
By default, logstream
only outputs lines with zero prefixes – i.e., all output is disabled because the default "DEAL" prefix is always there. But one can set a different maximal number of prefixes for lines that should be output to something larger, and indeed here we set it to two by calling LogStream::depth_console(). This means that for all screen output, a context that has pushed one additional prefix beyond the default "DEAL" is allowed to print its output to the screen ("console"), whereas all further nested sections that would have three or more prefixes active would write to deallog
, but deallog
does not forward this output to the screen. Thus, running this example (or looking at the "Results" section), you will see the solver statistics prefixed with "DEAL:CG", which is two prefixes. This is sufficient for the context of the current program, but you will see examples later on (e.g., in step22) where solvers are nested more deeply and where you may get useful information by setting the depth even higher.
The output of the program looks as follows:
The first two lines is what we wrote to std::cout
. The last two lines were generated by the CG solver because we called deallog.depth_console(2);
in the main()
function – see the lengthy discussion there. These two lines prefixed by DEAL:cg:
state the residual at the start of the iteration and that the solver needed 36 iterations to bring the norm of the residual to \(7.08\cdot 10^{8}\), i.e. below the threshold \(\tau\) which we have set in the solve()
function. Note that we have asked the residual to be less than \(10^{6}\) times the norm of the right hand side vector; the starting value is (here) this norm of the right hand side vector, and there is indeed more than a factor of \(10^{6}\) between the two values shown.
Apart from the output shown above, the program generated the file solution.vtk
, which is in the VTK format that is widely used by many visualization programs today – including the two heavyweights VisIt and Paraview that are the most commonly used programs for this purpose today.
Using VisIt, it is not very difficult to generate a picture of the solution like this:
It shows both the solution and the mesh, elevated above the \(x\) \(y\) plane based on the value of the solution at each point. Of course the solution here is not particularly exciting, but that is a result of both what the Laplace equation represents and the right hand side \(f(\mathbf x)=1\) we have chosen for this program: The Laplace equation describes (among many other uses) the vertical deformation of a membrane subject to an external (also vertical) force. In the current example, the membrane's borders are clamped to a square frame with no vertical variation; a constant force density will therefore intuitively lead to a membrane that simply bulges upward – like the one shown above.
VisIt and Paraview both allow playing with various kinds of visualizations of the solution. Several video lectures show how to use these programs. See also video lecture 11, video lecture 32.
If you want to play around a little bit with this program, here are a few suggestions:
Change the geometry and mesh: In the program, we have generated a square domain and mesh by using the GridGenerator::hyper_cube
function. However, the GridGenerator
has a good number of other functions as well. Try an Lshaped domain, a ring, or other domains you find there.
Change the boundary condition: The code uses the Functions::ZeroFunction function to generate zero boundary conditions. However, you may want to try nonzero constant boundary values using Functions::ConstantFunction<2>(1)
instead of Functions::ZeroFunction<2>()
to have unit Dirichlet boundary values. More exotic functions are described in the documentation of the Functions namespace, and you may pick one to describe your particular boundary values.
Modify the type of boundary condition: Presently, what happens is that we use Dirichlet boundary values all around, since the default is that all boundary parts have boundary indicator zero, and then we tell the VectorTools::interpolate_boundary_values() function to interpolate boundary values to zero on all boundary components with indicator zero.
We can change this behavior if we assign parts of the boundary different indicators. For example, try this immediately after calling GridGenerator::hyper_cube():
What this does is it first asks the triangulation to return an iterator that points to the first active cell. Of course, this being the coarse mesh for the triangulation of a square, the triangulation has only a single cell at this moment, and it is active. Next, we ask the cell to return an iterator to its first face, and then we ask the face to reset the boundary indicator of that face to 1. What then follows is this: When the mesh is refined, faces of child cells inherit the boundary indicator of their parents, i.e. even on the finest mesh, the faces on one side of the square have boundary indicator 1. Later, when we get to interpolating boundary conditions, the VectorTools::interpolate_boundary_values() call will only produce boundary values for those faces that have zero boundary indicator, and leave those faces alone that have a different boundary indicator. What this then does is to impose Dirichlet boundary conditions on the former, and homogeneous Neumann conditions on the latter (i.e. zero normal derivative of the solution, unless one adds additional terms to the right hand side of the variational equality that deal with potentially nonzero Neumann conditions). You will see this if you run the program.
An alternative way to change the boundary indicator is to label the boundaries based on the Cartesian coordinates of the face centers. For example, we can label all of the cells along the top and bottom boundaries with a boundary indicator 1 by checking to see if the cell centers' ycoordinates are within a tolerance (here 1e12) of 1 and 1. Try this immediately after calling GridGenerator::hyper_cube(), as before:
Although this code is a bit longer than before, it is useful for complex geometries, as it does not require knowledge of face labels.
A slight variation of the last point would be to set different boundary values as above, but then use a different boundary value function for boundary indicator one. In practice, what you have to do is to add a second call to interpolate_boundary_values
for boundary indicator one:
If you have this call immediately after the first one to this function, then it will interpolate boundary values on faces with boundary indicator 1 to the unit value, and merge these interpolated values with those previously computed for boundary indicator 0. The result will be that we will get discontinuous boundary values, zero on three sides of the square, and one on the fourth.
Observe convergence: We will only discuss computing errors in norms in step7, but it is easy to check that computations converge already here. For example, we could evaluate the value of the solution in a single point and compare the value for different numbers of global refinement (the number of global refinement steps is set in LaplaceProblem::make_grid
above). To evaluate the solution at a point, say at \((\frac 13, \frac 13)\), we could add the following code to the LaplaceProblem::output_results
function:
For 1 through 9 global refinement steps, we then get the following sequence of point values:
# of refinements  \(u_h(\frac 13,\frac13)\) 

1  0.166667 
2  0.227381 
3  0.237375 
4  0.240435 
5  0.241140 
6  0.241324 
7  0.241369 
8  0.241380 
9  0.241383 
By noticing that the difference between each two consecutive values reduces by about a factor of 4, we can conjecture that the "correct" value may be \(u(\frac 13, \frac 13)\approx 0.241384\). In fact, if we assumed this to be the correct value, we could show that the sequence above indeed shows \({\cal O}(h^2)\) convergence — theoretically, the convergence order should be \({\cal O}(h^2 \log h)\) but the symmetry of the domain and the mesh may lead to the better convergence order observed.
A slight variant of this would be to repeat the test with quadratic elements. All you need to do is to set the polynomial degree of the finite element to two in the constructor LaplaceProblem::LaplaceProblem
.
LaplaceProblem::output_results
: # of refinements  \(\int_\Omega u_h(x)\; dx\) 

0  0.09375000 
1  0.12790179 
2  0.13733440 
3  0.13976069 
4  0.14037251 
5  0.14052586 
6  0.14056422 
7  0.14057382 
8  0.14057622 
HDF5 is a commonly used format that can be read by many scripting languages (e.g. R or Python). It is not difficult to get deal.II to produce some HDF5 files that can then be used in external scripts to postprocess some of the data generated by this program. Here are some ideas on what is possible.
To fully make use of the automation we first need to introduce a private variable for the number of global refinement steps unsigned int n_refinement_steps
, which will be used for the output filename. In make_grid()
we then replace triangulation.refine_global(5);
with
The deal.II library has two different HDF5 bindings, one in the HDF5 namespace (for interfacing to generalpurpose data files) and another one in DataOut (specifically for writing files for the visualization of solutions). Although the HDF5 deal.II binding supports both serial and MPI, the HDF5 DataOut binding only supports parallel output. For this reason we need to initialize an MPI communicator with only one processor. This is done by adding the following code.
Next we change the Step3::output_results()
output routine as described in the DataOutBase namespace documentation:
The resulting file can then be visualized just like the VTK file that the original version of the tutorial produces; but, since HDF5 is a more general file format, it can also easily be processed in scripting languages for other purposes.
After outputting the solution, the file can be opened again to include more datasets. This allows us to keep all the necessary information of our experiment in a single result file, which can then be read and processed by some postprocessing script. (Have a look at HDF5::Group::write_dataset() for further information on the possible output options.)
To make this happen, we first include the necessary header into our file :
Adding the following lines to the end of our output routine adds the information about the value of the solution at a particular point, as well as the mean value of the solution, to our HDF5 file :
The data put into HDF5 files above can then be used from scripting languages for further postprocessing. In the following, let us show how this can, in particular, be done with the R programming language, a widely used language in statistical data analysis. (Similar things can also be done in Python, for example.) If you are unfamiliar with R and ggplot2 you could check out the data carpentry course on R here. Furthermore, since most search engines struggle with searches of the form "R + topic", we recommend using the specializes service RSeek instead.
The most prominent difference between R and other languages is that the assignment operator (a = 5
) is typically written as a < 5
. As the latter is considered standard we will use it in our examples as well. To open the .h5
file in R you have to install the rhdf5 package, which is a part of the Bioconductor package.
First we will include all necessary packages and have a look at how the data is structured in our file.
This gives the following output
The datasets can be accessed by h5f$name
. The function dim(h5f$cells)
gives us the dimensions of the matrix that is used to store our cells. We can see the following three matrices, as well as the two additional data points we added.
cells
: a 4x1024 matrix that stores the (C++) vertex indices for each cell nodes
: a 2x1089 matrix storing the position values (x,y) for our cell vertices solution
: a 1x1089 matrix storing the values of our solution at each vertex Now we can use this data to generate various plots. Plotting with ggplot2 usually splits into two steps. At first the data needs to be manipulated and added to a data.frame
. After that, a ggplot
object is constructed and manipulated by adding plot elements to it.
nodes
and cells
contain all the information we need to plot our grid. The following code wraps all the data into one dataframe for plotting our grid:
With the finished dataframe we have everything we need to plot our grid:
The contents of this file then look as follows (not very exciting, but you get the idea):
We can also visualize the solution itself, and this is going to look more interesting. To make a 2D pseudocolor plot of our solution we will use geom_raster
. This function needs a structured grid, i.e. uniform in x and y directions. Luckily our data at this point is structured in the right way. The following code plots a pseudocolor representation of our surface into a new PDF:
This is now going to look as follows:
For plotting the converge curves we need to rerun the C++ code multiple times with different values for n_refinement_steps
starting from 1. Since every file only contains a single data point we need to loop over them and concatenate the results into a single vector.
As we are not interested in the values themselves but rather in the error compared to a "exact" solution we will assume our highest refinement level to be that solution and omit it from the data.
Now we have all the data available to generate our plots. It is often useful to plot errors on a loglog scale, which is accomplished in the following code:
This results in the following plot that shows how the errors in the mean value and the solution value at the chosen point nicely converge to zero: