Steve Myers: "There are fewer things that can go wrong now."
STEVE MYERS, the Director of Accelerators and Technology at CERN, the European Organization for Nuclear Research, is the man of the moment in the physics world, as the Large Hadron Collider (LHC) at CERN, Geneva, the would-be most powerful particle accelerator in the world, prepares for restart in the third week of November. The LHC was forced to shut down for repair following a major mishap nine days after its launch on September 10, 2008. Myers’ highest priority right now is to get the LHC running this year.
At CERN, he has worked on the Intersecting Storage Rings (ISR), the Large Electron-Positron Collider (LEP), the Super Proton Synchrotron (SPS) and now the LHC. He joined CERN in 1972 as engineer-in-charge of the operation of the ISR. Since 1979, he has spent much of his career working on the LEP.
Excerpts from a detailed and frank interview he gave Frontline’s science correspondent, R. Ramachandran, at CERN:

Dr Myers, how does the prospect of the LHC coming up soon look now?
At the moment the planning is that we try to inject the first beam in mid-November, assuming, of course, that all things work according to schedule. Now that we are getting there, we don’t want to leave room for things to go wrong. There are fewer things which can go wrong now. We understand the machine a lot better now than last year.

But earlier this year you were talking of 4 TeV (teraelectronvolt) per beam. Now suddenly you have reduced it to 3.5 TeV.
We have reduced it to 3.5 TeV for the initial part of the run but we do hope to go to 5 TeV in the later part. Certain measurements that we made at the end of April and the beginning of May this year led us to make many other measurements and some repairs. What we would like to do is get some experience with the machine at 3.5 TeV per beam before we move on to the next stage.
And 3.5 TeV is not what we are going to do for the whole year. We intend to go to 5 TeV per beam and eventually to 7 TeV per beam. But in the first stage we will go to 3.5 TeV and spend some time there. The experimenters have indicated that this is a very interesting time for them because it is about four times the energy of [the Tevatron at Fermilab], the highest-energy accelerator in the world.
The layers of a superconducting cable interconnect and the copper stabiliser solder sandwich.
And we can make lots of diagnostics and measurements to confirm everything that we have done so far and then move on to higher energy.

You have also had some problems with helium leaks in two sectors.
Not a helium leak [but] it was a vacuum leak of helium from one place to another. When you say helium leak, people often think of September 19 last year when thousands of kilos of helium leaked. This is tiny, like a… [makes a hissing sound], which of course has the unfortunate consequence that the insulating vacuum isn’t as good as it should be and, therefore, you would not be able to run the magnets.
So we found out what that was. It was a fairly simple thing but a very complicated process. When we cool down certain parts of the machine we pump the helium at fairly high speeds, 50 m/s [metres/second], through flexible tubes. It was not known at that time – I don’t think anybody in the world knew this – that when you do this you provoke ultrasonic oscillations, or vibrations, in the tube, and these oscillations cause tiny, microscopic pinholes, which allow helium to escape. So we have taken out the flexible hose and replaced it with a solid one, and we will also not push the liquid helium through at 50 m/s but bring it down to 25 m/s, and that won’t have many consequences. It is slower. Maybe we will lose a day for the cooldown but that’s no big deal.

The main cause of the September incident was stated to be bad splices between the dipole magnets, the interconnects, due to bad soldering, which supposedly opened up causing an arc. Is that an established fact now?
It’s sort of 99.5 per cent certain. Of course, when you evaporate the thing which was the source of the problem, it is hard to be absolutely sure. So when you take all the other evidence, the measurements, and put them all together, we are almost 99.5 per cent certain that it was a bad solder. These solder sandwiches were there all right, but there was only a tiny amount of solder that had melted.
We also know that the 220 nano-Ohm resistance [of the superconducting cable joint] was something like a factor of 500 more than what it should have been (0.35 nano-Ohm). Of course, even if it is such a tiny resistance, the heat capacity in there is not fantastic because everything is at 1.9 K and if the temperature starts to move up to 4-5-6 K, the superconductor becomes normal and it is a thermal runaway.
A SCHEMATIC DIAGRAM of a normal operation with good interconnect (1.9K): the superconducting cable splice is good.
The resistance heating (current squared times resistance, I²R) with a 3,000-ampere (A) current and a resistance 500 times more than what it should be is very high. What happened in this particular case was that it started a process and the little bit of solder that was there melted. Once it goes, first of all, there is so much inductance, which means very high stored energy, and secondly, if you try to stop the current, you get an induced voltage because of that inductance.
So you have got all this energy, which in this case was hundreds of megajoules in the circuit, and what started as a small arc across the melted gap became bigger and bigger as things melted, into something like lightning, which burnt through the helium [enclosure] and then the vacuum [enclosure] and burnt everything. We have reconstructed that and we are 99.5 per cent sure that that was the cause of it.
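The scale of the heating Myers describes can be checked with a couple of lines of arithmetic. This is only an illustrative sketch using the figures quoted in the interview (0.35 nano-Ohm nominal, 220 nano-Ohm measured, 3,000 A); the real thermal behaviour at 1.9 K is far more involved.

```python
# Ohmic heating P = I^2 * R at a dipole splice, using the figures
# quoted in the interview (an illustrative check, not a CERN model).
I = 3_000.0          # current in amperes
R_good = 0.35e-9     # nominal splice resistance, 0.35 nano-Ohm
R_bad = 220e-9       # faulty splice resistance, 220 nano-Ohm

P_good = I**2 * R_good   # a few milliwatts: easily carried away at 1.9 K
P_bad = I**2 * R_bad     # about 2 watts, concentrated in one joint

print(f"good joint: {P_good * 1e3:.2f} mW")
print(f"bad joint:  {P_bad:.2f} W")
print(f"ratio:      {P_bad / P_good:.0f}x")
```

A few milliwatts in a good joint is easily absorbed by the cryogenics; a couple of watts concentrated in one splice at 1.9 K can push the superconductor past its transition temperature, which is the runaway described above.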
The telling evidence was when we looked back at the data after the accident. For every single sector we had several tests at various currents, including calorimetry for heat measurements. The highest test was at 7,000 A, where we sat for an hour. During this one hour the [Sector 3-4] joint which actually went had a very anomalous temperature rise, at a rate a factor of a thousand higher than the others. This was clear evidence, from before the accident, of a bad joint, and there is all the evidence after the accident. So we are almost 100 per cent sure. Nothing is ever 100 per cent in this life. We have even got other cases where the resistance was 100 nano-Ohms.

How did the projected summer restart of the LHC get delayed to autumn end?
We talked about the 0.35 nano-Ohms. Now we have a fantastic system for measuring those tiny resistances. We will be able to measure that online, almost continuously, every day, across the machine with the new Quench Protection System (nQPS) that has been developed and put in. The new detection system is a factor of 3,000 times more sensitive than the old one, and this is just because we really needed to do this. Quite honestly, we didn’t realise that this could be a problem before. Now we know and now we can measure.
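The factor of 3,000 follows from comparing the old 1 V detection threshold with the 0.3 millivolt per-joint threshold Myers quotes later in the interview. A short consistency check; the 6,000 A operating point is an assumed illustrative value, not a figure from the text:

```python
# Sanity check on the nQPS sensitivity figures quoted in the interview.
V_old = 1.0        # old quench-detection threshold, volts
V_new = 0.3e-3     # new per-joint threshold, 0.3 millivolt
print(f"sensitivity gain: ~{V_old / V_new:.0f}x")  # of order 3,000

# Voltage V = I * R developed across a splice at an assumed 6,000 A:
I = 6_000.0
V_good = I * 0.35e-9   # good splice: ~2 microvolts, far below threshold
V_bad = I * 220e-9     # faulty splice: ~1.3 mV, trips the 0.3 mV threshold
print(f"good splice: {V_good * 1e6:.1f} uV, faulty splice: {V_bad * 1e3:.2f} mV")
```

On these assumed numbers, the old 1 V threshold could never see a single bad splice, while the nQPS threshold sits comfortably between a good joint and a faulty one.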
The other thing we came across during the tests in April-May – and that’s the other reason for the delay – is that even if you had a perfect 0.35 nano-Ohm joint in the superconducting cable, when you get a quench the superconducting cable becomes normal and it cannot carry any current because of its very high resistance. The current should then flow through the copper [in the solder sandwich] [see figure] – that is the whole purpose of this copper stabiliser – during the time you are decaying the current from the quench.
A good interconnect after quench (>10 K): During the current decay after the quench, the current flows through the copper stabiliser.
So a quench happens, the current decays with a certain e-folding time, and during that time this [copper] short-circuits the resistance in the superconducting cable. That’s perfect. In that case you count on the fact that the current can flow through this to the adjoining pieces. If the butt [the copper joint] here or anywhere else is not good, the resistance becomes quite high. So the current will [instead] flow through the superconducting cable and then melt the superconducting cable.
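The current sharing Myers describes is essentially two resistors in parallel: the quenched (normal-state) cable and the copper stabiliser path around it. A minimal sketch, with all numbers assumed for illustration rather than taken from LHC measurements:

```python
# Current sharing between a quenched, normal-state cable and its copper
# stabiliser, modelled as two parallel resistive paths. All values are
# illustrative assumptions, not measured LHC figures.
def split_current(I_total, R_cable, R_copper):
    """Return (I_cable, I_copper) for two resistors in parallel."""
    I_cable = I_total * R_copper / (R_cable + R_copper)
    I_copper = I_total * R_cable / (R_cable + R_copper)
    return I_cable, I_copper

I = 6_000.0       # amperes decaying after a quench (assumed)
R_cable = 1e-3    # normal-state cable resistance (assumed)

# Good stabiliser path (~10 micro-Ohm): copper carries almost everything.
print(split_current(I, R_cable, 10e-6))
# Bad joint (assumed 200 micro-Ohm): a much larger share is forced
# through the no-longer-superconducting cable, which then overheats.
print(split_current(I, R_cable, 200e-6))
```

The design choice is the same one Myers states: the copper is there purely as a low-resistance bypass, so a bad solder joint in that bypass silently defeats the protection.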
The Americans call this ‘the silent killer’; we call it the ‘copper stabiliser problem’, and we set out to quantify joints like that in some way. We devised a method of measuring the resistances over all the octants of the machine, each just a little over 3 km, using the several voltage taps that we have on the magnets. The taps sometimes cover two or three of these joints depending on the configuration. We measured all of those, every one of the 10,000 splices of the machine, and of course it meant measuring micro-Ohms. A good [stabiliser] joint should have a resistance, at warm, of something like 10 micro-Ohms for the dipoles.
The bad copper stabiliser joint after quench (>10 K): Current is now forced to flow through the superconducting cable, which is no longer superconducting after the quench, causing it to heat up. The cable can suffer a thermal runaway and melt, resulting in an arc across the splice gap. A good superconducting splice resistance is 0.35 nano-Ohm and a good copper stabiliser joint resistance is 10-20 micro-Ohm.
In the resistance measurement over a sector – because of noise, all sorts of other things and the number of joints between two voltage taps – our resolution was of the order of 25-30 micro-Ohms. So you could not see something which was 20 or 25 micro-Ohms but you could see those with 35, 40 or 50 micro-Ohms.
That’s what we did first and we used this as a sort of early warning system to identify problem cases. We then cut them open and used another short local resistance measurement with which we can measure down to 1 micro-Ohm, which is much better. In this way we confirmed that those measurements were correct.
The splash event recorded by the CMS experiment on November 7. The electromagnetic calorimeter is in red, the hadronic calorimeter in blue, and the muon system in yellow and magenta.
In the first sector, I think we found 10 joints that were suspect, and when we did the accurate measurement they were identified as having anomalously high resistances. We have repaired, in five of the sectors, everything that is above 35 micro-Ohm, which is still a factor of three above what it should be. This is one of the reasons why we decided to be safe and go to 3.5 TeV first, because there are simulations which show that at 3.5 TeV we have a safety factor of more than 2.5. For 3.5 TeV, up to a resistance of 120 micro-Ohms we would still be safe.
This is all under pessimistic input assumptions. We know that the maximum that we have measured in the five sectors is 53-54 micro-Ohms, in other words a factor of about 2.5 safety. But because you have measured in five sectors, it does not mean that there is not one very bad one in the others. You always have to deal with the worst case. You can’t use statistics in this. You only need one. So we have to have a safety factor. We have to take some risk. So this is why we are a little bit conservative.
Personally I think we could have gone to 5 TeV right away, but there was a consensus decision that this would be 3.5 TeV. When we are only at 3.5 TeV, what we can do when we have quenches is, [since] we have all the diagnostics, see at what level the quench happened and what the resistance of the [superconducting bus bar] splice was – because we know that very well – to say that the simulation is very pessimistic. We have another factor of two safety there. Or, if you find that the factor of two is not a factor of two, then we just know the limit and we stay there.

Before the start-up in September 2008, didn’t you have these threshold resistance values for the splices?
We had the specifications for the superconducting cable joint, not for the copper stabiliser. And the specification was 0.6 nano-Ohm for the external joints, and 1.2 for the internal joints. The 0.35 nano-Ohm is what we measured, and that was always a little bit better than the specs.

But we had cases much more than the specs – 220 nano-Ohms and 100 nano-Ohms… orders of magnitude more…
Yes. But they were not measured. They were the specs for the manufacturer. But in order to measure those you have to put every single magnet on the test bed and make lots and lots of measurements. At a certain point in the project we decided that we needed to measure the maximum fields of the magnets, but we needed only to do a sort of spot check of the order of 25 per cent of the magnets for field quality. And we should have measured these [resistance] specs during the field quality measurements, but we didn’t. This was under the stress and pressure of getting ready on time, and taking what we thought was a reasonable decision.

How is the new QPS different from the one that already exists?
Okay. The first QPS was designed over a period of 10 years or so. So you have to be very careful that you don’t do more harm than good when you design a new one. So we decided that we would leave the existing QPS in place and the new one would be added on top of it. We have the same functionality as the old system and we add this new system.
Basically what we have done is to measure the resistance across the superconducting cable joint. The measurement is continuously being monitored – that’s how we know that we can measure resistances very well – and if the voltage increases above 0.3 millivolt then the energy is extracted from the magnet into the dump resistors. Compared to the earlier threshold of 1 V, we now have roughly 3,000 times the sensitivity. Besides its higher threshold, the earlier QPS had no voltage tap just across the joint between two superconducting dipole magnets. It was over the whole sector. Now the nQPS has it. It’s a bad name. It’s actually a magnet protection system which provokes a quench.

Or an interconnect protection system…
In this case it is interconnect. We always had it before for the magnets. So, that allows us to basically have no worries about the contact between the superconducting cables. We will be able to measure every single one of them. We will be able to say there is a problem here, we stop and fix that, if we ever have one. What the system will also allow us to do is to see if there is any change of the resistance as a function of time.
So if we start with a very good machine and after two or three years the resistance starts to climb, without this you wouldn’t know what you were measuring and you could have another one of those accidents.
So I think we are very happy about that. We are developing a system where we can automatically measure this as well. We are not there yet. To do this accurately involves warming up all the sectors, which costs us a lot of time.
The new system that we are thinking of building may require warming one of the sectors to 15 or 20 K, passing a special pulsed current through and analysing it. Maybe we will then be able to have a measurement resolution of 10-20 micro-Ohms in the copper stabilisers instead of the 25-35 micro-Ohms we have now.
The ones that we have repaired this year are absolutely spot on: the minimum that you can have for the dipole copper stabiliser is 9.75 micro-Ohms and the maximum we have seen is 11 micro-Ohms. We have repaired about 80. That means that it can be done and these can work. On the other hand, we need to gain some experience to see how these things behave with time, because there is no mechanical stability and everything relies on this solder.

The 3.5 TeV corresponds to a current of about 6,600 A. Now the magnets are already there in the tunnel, unlike earlier, when you had trained them to 13,000 A before installation. Are we sure that the magnets remain trained to 13,000 A?
No. We know that they are not. When we had the problem in September, this is what we were actually doing in Sector 5-6. We were starting to train the magnets up to 7 TeV. We noticed that some of the magnets which had been trained to more than 7 TeV before quenched at 5.8 TeV after the first warm-up. That was quite strange. So we took Sector 5-6 and started doing this training exercise, and after something like 35-40 quenches we got to 6.8 TeV, which is a lot worse than what we thought.
We thought one or two quenches would be sufficient. But it seems as though there is some physics which caused the magnets to detrain. We haven’t really had the time to go into all that over the last year because of the repairs and the other work we had to do. We will start looking at that very seriously over the next few months. We know that we can go to 6.5 TeV relatively easily. So to get the last bit we had better understand what caused it.
We had better understand it for another reason too. Any time we have a shutdown and warm the magnets up, we want to know if they will lose their memory again, because they have once lost their memory and that’s strange. On the other hand, for reasons of delay in the machine and so on, these magnets have been left sitting in car parks for as much as three years. Nobody has ever done that to magnets before. It’s a big unknown. So we don’t know what to expect.

Do you plan to do that for the entire machine sometime?
No, we will not do that. What we hope to do now is to get the machine running, get the physics and prepare for doing that next year, 2011, next run.

You have a tight schedule right now. A slight delay, and you are already into December and then Christmas…
Oh! We will do it. I hope there won’t be any more delay. I hope to give the experimenters a Christmas present. If there is a week or two weeks’ delay, then we will do the maximum we can and we will put the machine into pause mode for a few weeks, because so many people have been working for such a long time, and Christmas is the sort of time when it is very difficult to have everybody here, and you can’t insist that everybody should be here. So we have decided that from December 19 to January 3 we will put the machine in standby mode so that we come back quickly.

So what are the lessons to be learnt from the whole thing that has happened, from the perspective of a major international collaborative effort – in the sense of quality control of things that have been done outside, and their testing, and so on?
I think the procedure was perfect; it was the implementation which was inadequate. The procedure for this, for example, called for outside companies to do the soldering, then for the supervisor or the quality control person from the company to sign it off, and then finally for the CERN person to check it and sign it off. So it should have been good work first, quality control second, and quality control third. Clearly, the work was bad, the first quality control was no good and our quality control also didn’t work. That’s the only thing we can say. We can say only two things about Sector 3-4 too. It is a sector in the machine where, when we were tunnelling, we hit an underground lake. It’s a very damp part of the tunnel, and in the first years of running the previous machine [LEP] there was water inundation there and we had to do all sorts of things. It is an area of the machine which is cold and damp.

Was that the reason for it being ramped up last?
No. The tunnel should have been repaired after we finished with LEP. After we dismantled LEP, the plan was to fix the Sector 3-4 tunnel before we installed the LHC. For financial reasons and so on – I think the management at that time thought we could save the money – it was left as it was. There is no problem with the tunnel. It is just cold and damp.
Also, at that time there were some problems with the evacuation alarms. We were getting spurious evacuation alarms in Sector 3-4. Whether this was due to the cold and damp [conditions] we do not know. And, of course, the poor guys in the tunnel, in the cold and damp, when this alarm goes off they have to run, go upstairs, and come back some time later. Now, whether they came back and started on the next job without finishing the previous one or not, it’s extremely difficult to know what happened. But it is a possibility that it was related to the whole sequence of events.
The 53rd and final magnet for the Sector 3-4 repairs being lowered into the tunnel on April 30, marking the end of repair work above ground.
Nevertheless, the second and the third checks can’t have been done, because had they been done the problem should have been caught. So the procedure was correct; the implementation may not have been correct. We had a similar thing with other companies. We had prepared a checklist for them for quality control. They had ticked everything off but it was clear that they hadn’t done anything. This is, I think, a lesson to be learnt…

Some parts have been supplied by India, notably the anchoring jacks and the QPS power supplies. Have there been problems with any of them?
No. They were good. The QPS power supplies worked beautifully. The anchoring… we have forced the anchoring. For the same reason that we did not expect such a high burst of helium, we did not expect that there would be so much pressure on these. They are designed so that they can take contraction and pull and push. The other problem was that a lot of the anchors were ripped out because the concrete of the old tunnel from LEP days didn’t hold them. But from the Indian perspective, all of them have been of good quality. And I was almost personally responsible for the QPS power supplies. We were glad that we got such a good price and such good quality.
Copyright © 2009, Frontline.