Median tips in Matlab

Medians are sometimes necessary when trying to characterize data. Datasets with large, infrequent outliers just tend to skew the mean too much for it to be a good estimator of the data’s bulk behavior.

I recently had a need to process a lot of FITS files with the median operation.  The details of this are not terribly important except that I need a good estimator of each pixel value across ~50 FITS frames (about  18 sets of these 50 frame chunks).  To quickly get an estimate of what I would be in for I checked out the performance of Matlab’s median operation. Doing these calculations doesn’t take a lot of time individually but collectively I was looking at about 1/2 hour to get just medians.  Out of curiousity I checked into the Profiler to examine if readin was taking longer than the processing. The images were being read at about 50Mb/sec @ 2.2Mb each so about 25 fps on my machine. Median calculations on the other hand were taking about 8sec each! — which of course depends on the type of data you are feeding the median operation — however for those keeping score that’s 6.25 fps.

I wasn’t expecting the median to be fast since a KxLx50 matrix is not the optimized layout for caching. Matlab does eventually rotate the input for this case and copies all 2Gb for sorting. I was suspect of the sorting operation and the necessary rotation but most of the time wasn’t spend doing this operations(!).

About 3/4 of the time was spent processing the case for even element median operations!

Feeding in 45 frames for median’ing raised my processing rate to 2.1sec per med op (that’s 24 fps kids).

Moral: If you can, only do median operations on odd numbers of elements {possibly matlab specific}

Extras: By the way I am processing 16bit numbers from the FITS files. Interestingly the median operation on these integers gives an actual pixel value for inputs where the number of frames median’ed is odd! When the number of frames is even the median that is computed is round(mean( [sorted_pxl_vals( N/2 ) , sorted_pxl_vals( N/2+1] ) )). This may not correspond to an actual pxl value and at least for my data significantly skews the mean of the pxl value distribution by about 3 LSBs.


Pandora’s Promise: Social Barriers to Nuclear Power

http://pandoraspromise.com/

If you haven’t seen or heard about this film, I would encourage everyone to seek it out. The ‘conversation’ about what to do with nuclear power has been and (in the US) is currently being treated with avoidance. Whatever your stance, bring your facts and your poker face… you will surely need both.


Going ‘for’ a loop in Matlab syntax

For the most part scripting in Matlab should avoid the use of for loops. Generally the operations can be much quicker in the ‘vectorized’ notation that Matlab has worked hard to optimize. The link above is worth reading although as pointed out by in the Matlab Hypertext Ref, Matlab is not necessarily the best choice for real time tasks anyway (it’s a scripting language!)… so getting the code right is more important than getting it done fast.

However if you do use for loops there are major differences between those in Matlab and those in other languages like C++. In other languages you may be able to write something like

for( int x = 0; x <= 10; x++){
x = 11;
}

//And the loop executes once!

So the loop goes around once (x=0) and then stops because (x=11 !<= 10). This is not really surprising until you look into what Matlab does in a similar situation:

 

for x = 0:1:10
x = 11;
end;
%% the loop runs 11 times

So this is not what would generally be expected. We told Matlab to go from 0 to 10 in increments of one just like our C++ code but we got a drastically different result. If we printed out the values of x between assignment and just after assignment we would see that the variable oscillates back and forth between the assigned value (10) and the incrementing values 0 to 10. This is a result of the vectorization of the loop, which includes an indexing that the user does not see. The variable x is not incremented by x++ or x=x+1, rather Matlab sub-indexes the index x with another counter variable.

Who cares?! Well if you wanted to say change the flow of work that Matlab does throughout the loop, you really can’t just assign to the x variable to get it done… you either have to switch loop types or introduce yet another counter (that’s three counters to do a single loop for those keeping score).

Anyway stick to vectors in Matlab if possible! Or save money and get Qt.


Matlab fsolve

Hey everyone, long time no post :). It’s also been some time since I have been working in Matlab, but today I found enough time to waste. I am working on a UNIQUAC model for predicting binary solution activities, and also have been trying to generate a X-Y plot for distillation. I ran into a problem however, I have a function:

%Uniquac.m
function gamma = Uniquac(r_matrix, q_matrix, a_matrix, T, x1)
... (More code here)

The extra parameters in the function above are really nice, I might have to call several different binary components with the same function. While I could hard-code the constants r_matrix, q_matrix, and a_matrix… I really feel cheap doing so. Why does it matter so much? I don’t want to corrode my perfectly general code with very specific constants. Consider the very real possibility that you might want to use this function in fsolve, a numeric solver in Matlab. Fsolve requires:

1. That the first input be a function.
2. The second input must be a ‘guess’ for a possible solution.

The problem is that when you use a function like Uniquac.m, fsolve(@Uniquac, x0) says to Matlab: “Use the matrix x0 as the starting point for meeting my constraints.” But what is x0?

Well, it’s the inputs of Uniquac.m… which are r_matrix, q_matrix, a_matrix, T and x1. But hold on… maybe I only want to allow T to change and if r_matrix, q_matrix, a_matrix, and x1 are supposed to be constants, then it would be more than desirable if Matlab didn’t change them. But in this configuration Matlab cannot distinguish between the ‘constants’ and the variables that you want to change.

So what then? How to fix this problem? Well, you could go back and ruin all that beautiful code you just wrote by hard-coding constants and making global variables… but you could also go back to dragging your fists and beating things with a club.

OR we could do something VERY hip and gets rid of the bad vibes. For this we pull out a seemingly useless friend: the anonymous function. We covered these previously but here’s the angle:

%Some other script than Uniquac.m
shortUniquac = @(T) Uniquac(r_matrix, q_matrix, a_matrix, T, x1)

Yea! Now you’ve got a good as new function shortUniquac(T) that depends only on T… without a bunch of band-aids. Now for fsolve:

%fsolver
answer = fsolve(shortUniquac, T0)

If you would like the files for the Uniquac stuff, leave a comment with your email (do not put “example@gmail.com”, rather “example \ gmail” to avoid spam)  and what you intend to use it for and I’ll send you a link.

——-Updated 04/23/2012———

User List

  • Changsoo Lee — Yonsei Univ. Korea
  • Bobby J  — The University of Edinburgh
  • Santiago — University of Los Andes; Bogotá, Colombia

Porous Media Solution

Startup

Open up COMSOL Multiphysics 4.0a (Classkit License) and we will have a screen like this:

Model Selection

If you need a general procedure for how to open models in COMSOL head over to the previous post . For this particular problem let’s use the Chemical Species Transport and the sub-model Transport in Porous Media (chpm). Select Steady State and finish your model by clicking on the flag. You could do multiple physics to solve this problem… and in fact Matthew Bible and Isaac Keebler have a very nice solution using the Porous Media and the Dilute Species Transport. Their solution is cool but I’m going to keep it simple.

Building the Model

Refer to the previous post and see how to generate geometries… here we will be making three rectangular shapes to represent our problem. With three rectangles that are 0.2 m wide and 0.8 m tall. Although maybe not required I would union the geometries.  Union by selecting Geometry in the Model Builder and selecting all the geometries under your Geometry heading and then pressing the Form Union item on the list. Follow your nose to get the union. Here’s the geometry:

Boundary Conditions

Basically you want two concentration conditions on the vertical boundaries at x=0 and x=0.6, open boundaries at the top and bottom of your geometry in you second rectangle (that’s the lines y = 0 and y = 0.8 where x \in (0.2, 0.4), a high diffusivity in the middle of your geometry (~1), a large porosity in your middle geometry (~1), and lower diffusivity and porosity on either side of your middle region (I used the defaults).

That’s it, solve your model. Solution screen shot, maybe soon.


COMSOL Intro

An important thing to remember about COMOSL 4.0 is that this interface employs the use of right clicks as shortcuts. One of the big changes from the old COMSOL gui to the new one is the suppressed menus; the upshot is a simpler looking interface… though sometimes finding stuff can be difficult. That said just don’t forget to look for menus by right clicking.

Start Up

COMSOL Multiphysics 4.0a (Classkit License) is the specific breed of software we will be using. After entering the program there should be a screen like this:

There’s a couple of things to get used to here but mainly just focus on the left hand menu bar… we are going to select a geometry there. The example problem we have is the handout given on Tuesday. If you don’t have it, just keep reading.

Model Selection

We are going to start off with a 2D model with three domains (regions). Here are the steps:

  1. COMSOL Multiphysics 4.0a (Classkit License) startup
  2. Left hand column, under Model Wizard, go ahead and Select Space Dimension (click on 2D and press the advance arrow at the top of the Model Wizard bar)
  3. Left hand column, under Model Wizard, let’s pick a physics… under Chemical Species Transport select Transport of Diluted Species. Select the type and then click the advance arrow.
  4. Left hand column, under Model Wizard, if you will Select Study Type… that’d be stationary. Try to suppress the urge to select time dependent, we can be men later. Click the flag to finish your model selection.
  5. Now left hand column, select the Model Builder tab. You’re done!

Here’s your screen if you have followed the steps:

Geometry

The subject I kicked my big brother’s butt in. Anyway let the first thing that you notice on your screen (or the second) be that along the top bar there are a good number of icons for building a certain type of geometry. You can use those, but the old COMSOL used to have an interface where you could input your dimensions of the geometry by points and side lengths, radii… basically a less visual but often more precise method. That still exists in COMSOL but it’s under the right click menus. Try this out:

  1. Left hand column, under Model Builder, left click once on Geometry then right click on Geometry to bring up the right click menu.
  2. Select rectangle and aha! there is that menu that let us build stuff.
  3. Put in values for your rectangle: height = .8 and width = .2.
  4. Press enter and… nothing happens. You need to Build All, it’s under that left hand column near the top… it looks like a building.

Boundary Conditions

Let’s have a short talk about this stuff… like what was said before, the right click menus are important here. We’ll do a sample and then suppose that the rest are similar. Check it out:

  1. Left hand column, under Model Builder, left click Transport of Diluted Species (chds) then right click the highlighted selection and a list of BCs come up.
  2. Try Concentration and check the Species C checkbox. Enter the concentration the desired concentration.
  3. Go into your Graphics window and select a boundary, then click the + button. If you have done it correctly your boundary number shows up in your list.

Here is where you can set certain physical properties like bulk density, porosity… etc. Just try to find the appropriate condition where you would find your value and it’s likely there. Do not forget to press the + or your changes will not take place.

Solution

Let’s mesh:

  1. Left hand column, under Model Builder, let’s right click the Mesh category
  2. Select your mesh
  3. Top right corner of the left hand column, Build All… it’s the building… haha.

Now your meshed. Let’s solve this puppy:

  1. Left hand column, under Model Builder, right click the category for Study.
  2. Compute

That’s it… I think there’s a COMSOL help manual somewhere crying tears of blood and shaking it’s head at me right now. Toododoloose.


Matlab Differential Equations

Syntax

These differential equations can be solved analytically but practice with numerical solvers is kind of a cool thing.

Anyway, the syntax for the ode solver is pretty straight forward and standard for the different solvers:

ode23 (@fun, [t_0 t_1], initial, options)

with the inputs, in order from left to right, being a Matlab function, time interval, initial conditions and options (which is a place for specifying precision and some other stuff). What really needs to be emphasized is that this functional input needs to be a Matlab function stored in the current working directory as a .m file that has the same name as the Matlab function. Matlab has kind of a slick way of brushing that under the carpet in their help file but the point is that your DE(s) need(s) to be stored that way; meaning what? You have to use a combination of the command line and .m file editor to solve your differential equation. An example function:

%Matlab function f.m
function dy = f(t,y)
k1 = 2;
k2 = .5;
dy = zeros(2,1);
dy(1) = -k1*y(1);
dy(2) = -k2*y(2) + k1*y(1);
end

Here we are solving \frac{dy_1}{dt} = -k_1 y_1 and \frac{dy_2}{dt} = -k_2 y_2 + k_1 y_1. Save this as f.m. Then solve the system at the command prompt:

ode23(@f, [0 20], [1 0])

A quick check, if this is not working have you:

  1. opened and saved your function containing your DE(s) as f.m
  2. saved this file in your current directory? If you don’t know what this is type cd at the Matlab command prompt.
  3. been prompted to “Add to path”, “Make current path”, etc? Adding to the path is easiest.
  4. any variables that you were using previous to this that would have the names “y” , “t”, “k1” or “k2”? Try clear variables at the command prompt.
  5. copied and pasted the code. Sometimes it’s easy to small stuff.

Probably the oddest part of this form is the ‘@’ call. This is used for designating a function, so if our differential equation is represented by function fun then @fun tells Matlab to use the saved function fun. This is probably the way that Matlab separates a variable and function by syntax. There’s more to be said about this, if your interested skip down the page to Functions.

Functions

Functions in Matlab must be saved as .m files. This is a unique feature in Matlab and is mostly a consequence of the fact that Matlab is hardcore numerical. There are a couple of different kinds of functions; specifically I will only talk about anonymous and “regular” functions (sorry about that, I really did try to find out what Matlab called these things… I just didn’t see anything about it).

Regular functions are those which must be saved in .m files… which suggests that there are some functions that don’t have to be saved in .m files. Ok, so I lied… but for good reason: in ALMOST all cases functions must be saved as .m files and while it is inaccurate  to say ALL functions must be saved as .m files, there are VERY few cases where anonymous functions can be used. However, anonymous functions are a slick way to get something done quickly. For the differential equation:

\frac{dy_1}{dt} = -k*y_1

We would have to build a .m file for something very very simple. The .m file:

%Matlab function simple.m
function dy = simple(t,y)
k = 1;
dy = -k*y
end

The fact that you would have to generate this .m file for something so simple is kind of frustrating, huh? If you’re not convinced take a look at how much work you would have to go through to try and change your k for say 50 or 100 values of k. That’s a lot of clicks. Wouldn’t be great if there was something that allowed you to do all of this from the command prompt without the accessory .m file? Anonymous functions will let you. In the command prompt:

%command line
k = 1;
dy = @(t,y) -k*y;
ode23(dy, [0 20], 1)

Ok, that wasn’t that great. Let’s do something better:

%command line
for k=.5:.01:1.5
dy = @(t,y) -k*y;
ode23(dy, [0 20], 1)
hold on
end

Here’s what I got:

Ha… cool. That’s it.