Message boards : Wish list : Higher GPU utilization

EmSti [BlackOps]
Joined: 21 Apr 12
Posts: 4
Credit: 37,802,925
RAC: 0
Message 35215 - Posted: 22 Feb 2014 | 0:42:46 UTC

I'd like the ability to tune the app to meet the needs of faster GPUs in order to fully utilize them. Some GPUs I would want to push to the limit while still providing valid results; others I would want to tune for good results without affecting the user experience while doing other activities on the machine (reduce visual lag).

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 35223 - Posted: 22 Feb 2014 | 11:24:30 UTC - in response to Message 35215.

To better utilize the GPU you can do the following:
Install the GPU in a Linux, XP or Server 2003 R2 system. This will bring a 10 to 15% improvement.
Reduce your CPU usage. This can bring a ~5% improvement, but more if your setup is bad.
Overclock. This is model specific and what you should OC depends on the GPU type. Some cards just won't OC but a ~5% gain is common and 10% is not uncommon.
Reduce the temperature of the GPU. Simply increasing fan speed or lowering the system temperature can improve the Boost (if your card boosts), and also prevents some downclocking. A 1 to 3% improvement can be gained this way.
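
For anyone who wants to see whether a change from the list above actually moved utilization, temperature or clocks, the following is a minimal sketch that reads those three values through the nvidia-smi command-line tool. It assumes nvidia-smi is on the PATH and that the driver reports the queried fields; some GeForce cards report "[Not Supported]" for some of them, which the sketch maps to None. The function and key names are made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi: GPU load (%), temperature (C), SM clock (MHz).
QUERY = "utilization.gpu,temperature.gpu,clocks.sm"


def _to_int(field):
    """Return the field as an int, or None when the driver reports no number."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_line(line):
    """Parse one CSV line such as '82, 67, 1058' into a dict."""
    util, temp, clock = (_to_int(field) for field in line.split(","))
    return {"util_pct": util, "temp_c": temp, "sm_clock_mhz": clock}


def read_gpus():
    """Query every NVIDIA GPU once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_gpu_line(line) for line in out.splitlines() if line.strip()]
```

Calling read_gpus() every few seconds while a task runs, before and after changing fan speed or clocks, gives a rough picture of whether the card is boosting or downclocking.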

Unfortunately there are many task types at GPUGrid and each has its own stability requirements and performance limits, so you can only be general about clock speeds.

On some cards you need to downclock to make it stable, but you can't expect to watch HD video and crunch on an underclocked GPU - you can't have your cake and eat it!

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

EmSti [BlackOps]
Joined: 21 Apr 12
Posts: 4
Credit: 37,802,925
RAC: 0
Message 35233 - Posted: 22 Feb 2014 | 20:53:07 UTC - in response to Message 35223.

While most of those replies do address how to make a wu run faster, they don't address the wish list request.

GPU clock and fan control are well known for controlling the speed of the wu on the gpu, but the advice doesn't address the wish to control % of GPU usage.

Running things on Windows 7/8 at higher % utilization on a GTX 680 and up is not an impossible task; it is being done elsewhere. Getting more load on a GTX 680 is possible by running 2 WUs with free CPUs, but that still only gets it to 97%. Recommending that people move their GPUs to a different or older OS is an interesting approach. What is the percentage of GPUGrid users using Windows 7/8?

Clock speed isn't what I was considering when I wished for higher utilization. My hope was that the application might allow users to tune the number of concurrent items the tasks send to the GPU, or adjust threads, to push more work at GPUs that still have more to give. I understand the need to make sure the default settings work on all/most supported GPUs; I'm just wishing for a method to change the defaults. Maybe an optimized version of the application for the new architectures is needed.

On Collatz, I like Slicker's approach for those WUs: I can push one GPU really hard, but tune the GPU I use on a daily basis back from the edge to prevent laggy performance on the screen. I'm just wishing for some version of those features here. Of course I understand that different types of application may need different types of tuning, and Collatz and GPUGrid are different programming problems.

Sorry for the wordy response.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 35244 - Posted: 23 Feb 2014 | 3:21:02 UTC - in response to Message 35233.

While most of those replies do address how to make a wu run faster, they don't address the wish list request.


I can't speak with certainty for skgiven but I have a hunch he ignored (aka didn't address) your wish because it's already been wished for in a different thread(s) in a different section(s) and was addressed then. I have a hunch the admins and the developer will ignore your wish for that same reason. It's just IMHO, I could be wrong.

You can find those other threads and read them yourself if you wish but I think you'll find the following is an accurate summary... this is Hell and your wish is a snowball.

____________
BOINC <<--- credit whores, pedants, alien hunters

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 35266 - Posted: 23 Feb 2014 | 13:13:10 UTC - in response to Message 35244.
Last modified: 23 Feb 2014 | 13:21:22 UTC

GPUGrid runs complex molecular modelling simulations using CUDA based apps, not simple mathematical models, as is the case at other GPU projects. The tasks are long and intricate. They rely on recent drivers and system stability. Pushing cards to 99% often increases instability. Lots of people struggle to complete tasks that use 80% of the GPU. So running at 99% would cause more problems.

The root of the inefficiency with Vista and subsequent Microsoft operating systems is the operating system's WDDM (Windows Display Driver Model). While this improves stability somewhat for this project, it is inherently less efficient than XP and Linux. GPUGrid cannot fix this.

Running 2 tasks means that 2 tasks are competing for the same resources. It’s folly to think that they would ever improve performance to 99% overall. Even if the GPU says it’s running at 99% that will not translate into task performance. The WDDM isn't just going to go away! At best you are going to squeeze out another 10% performance. However, when utilization is above 80% there is little or no gain. In fact in some cases the task completion rate is reduced. At present only GPUs with 3GB of GDDR5 or more would complete more work by running two tasks. However the picture is foggier than that. Failure rates rise and then there are the issues with different task types; some tasks will be slightly faster when run together, some will be about the same overall, some will be slower overall, some will grind to a halt and some will cause system responsiveness issues or system failures.
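
The point that a busier GPU does not automatically mean more completed work can be checked with simple arithmetic. The sketch below compares tasks finished per day when running one task at a time against two at once; the runtimes are hypothetical numbers picked only to reproduce the "at best another 10%" case, not measurements from GPUGrid.

```python
def tasks_per_day(concurrent, hours_per_task):
    """Tasks completed per day when `concurrent` tasks each take `hours_per_task`."""
    return concurrent * 24.0 / hours_per_task


def relative_gain(single_hours, paired_hours):
    """Throughput of two concurrent tasks relative to one task at a time."""
    return tasks_per_day(2, paired_hours) / tasks_per_day(1, single_hours) - 1.0


# Hypothetical runtimes: 10 h for a task alone, 18.2 h each when two share the GPU.
print(f"{relative_gain(10.0, 18.2):+.1%}")  # +9.9%
```

Under these assumptions the break-even point is when each paired task takes exactly twice as long as a task run alone; anything slower than that means two at once completes less work, whatever the utilization reading says.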

The project management/maintenance side:
Since 2010 there have been four series of GeForce cards: GF400, GF500, GK600 and GK700. Last week the first Maxwells arrived. Merely catering for these GPUs is a challenge. To get the most out of the new and best cards the team needs to use new versions of CUDA and drivers. Sometimes there isn't anything in the new CUDA versions (it's not any faster), sometimes they are required just to support the card and the app, and sometimes new apps bring improvements to all cards, to some, or a mixed bag that improves the newer cards at the expense of the older cards. The project requires constant development.

The project defines its research into the generic areas of Cancer, HIV/AIDS, Neural Disorders and Methods; however, there can be several lines of research on-going within any of those categories at any one time. The WU types change frequently, so it would be infeasible to allow the selection of WU types; thousands of people would have to log in regularly and reconfigure their settings correctly. Support would become very difficult in what is already a difficult project to support. The project's research continuously changes.

Testing and creating new batches of tasks means that the project requires constant maintenance.

As suggested, all of the above has been discussed at length both in the forum and with the researchers with the intention of trying to make things better. Many suggestions have been taken up and implemented over the years but other suggestions have been deemed infeasible. Basically, it is the way it is because that’s the best that the GPUGrid team could do with the resources at hand, and their time limitations… Hence my suggestions!

mikey
Joined: 2 Jan 09
Posts: 297
Credit: 5,726,511,115
RAC: 29,991,333
Message 35277 - Posted: 23 Feb 2014 | 14:56:58 UTC

And maybe that is the answer: "The project defines its research into the generic areas of Cancer, HIV/AIDS, Neural Disorders and Methods; however, there can be several lines of research on-going within any of those categories at any one time. The WU types change frequently, so it would be infeasible to allow the selection of WU types; thousands of people would have to log in regularly and reconfigure their settings correctly. Support would become very difficult in what is already a difficult project to support. The project's research continuously changes."

Give people the option of what kinds, generically speaking of course, to run, making them better able to tweak their own PCs. Instead of only giving them these choices:
Run only the selected applications:
ACEMD short runs (2-3 hours on fastest card): yes
ACEMD beta: no
ACEMD long runs (8-12 hours on fastest GPU): no
test app: no

Change that so it lists your generic types instead, then ALSO change this default:
If no work for selected applications is available, accept work from other applications? no

to YES so that if people run out of Cancer units they will automatically get the other kinds.
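
The fallback rule proposed here can be written down in a few lines. This is only a sketch of the suggestion as worded in this post, not of how the GPUGrid or BOINC scheduler works; the function name, the dictionary of queued work and the area names are all hypothetical.

```python
def pick_work(available, preferred, accept_other=True):
    """Return the research area to fetch work from, or None if nothing qualifies.

    available: dict mapping area name -> number of queued work units
               (hypothetical server state)
    preferred: areas the volunteer selected, in order of preference
    accept_other: the "accept work from other applications" default
    """
    # First try the areas the volunteer asked for.
    for area in preferred:
        if available.get(area, 0) > 0:
            return area
    # Otherwise fall back to any area that still has work, if allowed.
    if accept_other:
        for area, count in available.items():
            if count > 0:
                return area
    return None


queue = {"Cancer": 0, "HIV/AIDS": 5, "Methods": 2}
print(pick_work(queue, ["Cancer"]))         # HIV/AIDS
print(pick_work(queue, ["Cancer"], False))  # None
```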

If people could stop getting wildly different types of units, maybe they could tweak their PCs to be better for them, and ultimately you. Yes, it may cause your stats to change; you could be out of Cancer units and have thousands of HIV/AIDS units left uncrunched, but if you play around with the credits slightly that too could change, i.e. as your internal deadlines/goals approach, offer more credits for the units you want crunched more quickly. What I am trying to say is: instead of using a take-it-or-leave-it approach, try more of a let-the-user-decide approach and work with them, via the carrot and the stick, to guide them down your yellow brick road to your goals.

Ultimately this could even benefit future workunits on your end: as you find more people using this or that GPU on Cancer units, you could tweak those units to best run on those GPUs. We all have reasons why we crunch: loved ones lost to this or that disease, the idea of aliens coming, rocks hitting the Earth, etc. Let us help you and you might be surprised where it can take the two of us.

You must already know about DistRTgen and the outrageous amounts of credits they give per workunit, and have decided not to even try to compete with them. People are here because they want to be, not just for the stats. Let's work together to solve your ultimate goals and just maybe we can both be a bit happier on the journey. MANY years ago Seti did the 'slam bam thank you mam' approach, and although they do still have the largest numbers of initial crunchers, they ALSO have the largest numbers for people leaving too. They too have now decided to be, at least slightly, more user friendly. They now shut down 'for maintenance' regularly, they have reportedly stopped reissuing workunits just to keep the work flowing, etc, etc.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 35286 - Posted: 23 Feb 2014 | 17:47:05 UTC - in response to Message 35277.

@skgiven,

Well said! That sums it up completely, coherently and concisely. An excellent piece.

@Mikey,

Good points regarding allowing volunteers to select the types of tasks. It's a good idea but would be difficult/impossible to implement in a way that doesn't require even more work from the admins/dev on a continuing basis. If you don't program then it's not easy to understand why it is that way but it is. Not saying I'm a programmer genius but I've been doing it long enough and have attempted such a variety of applications that I can say with confidence it wouldn't be feasible to do what you want the way you suggested.

Fortunately, there may be another way to allow crunchers to choose which task types they run. I've floated the idea before and asked the admins for their thoughts on the matter and haven't received any reaction. So the next step is to force a reaction by implementing my idea. If it meets with approval then OK; if it's condemned by admins or users I'll disable it.

The plan is to provide a script that runs on our hosts, not the server. The script will allow users to "block" task types that do not run well on their host. It could also attempt an analysis of why tasks are failing and offer suggestions on how to remedy the situation without blocking tasks. The problem is that the script can't actually block any task from being downloaded onto the host. It has to wait for whatever task comes down the pipe and abort it if it's on the user's blacklist. The potential problem is that if a system aborts too many tasks in one day the server might refuse to send it more tasks for 24 hours. But what is the sense in accepting tasks that will likely error anyway?
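
A rough sketch of the abort-on-arrival idea, using the boinccmd tool that ships with the BOINC client, might look like the following. It is an illustration under assumptions, not the script described in this post: the project URL, the blacklist pattern and the helper names are hypothetical, and the parser assumes that `boinccmd --get_tasks` prints one "name: <task name>" line per task. It does nothing about the daily quota problem mentioned above.

```python
import fnmatch
import subprocess

PROJECT_URL = "http://www.gpugrid.net/"  # assumed URL as registered in the client
BLACKLIST = ["*EXAMPLE_TYPE*"]           # hypothetical task-name patterns


def parse_task_names(get_tasks_output):
    """Extract task names from the text printed by `boinccmd --get_tasks`.

    Assumes each task block contains a line of the form 'name: <task name>'.
    Lines such as 'WU name: ...' are ignored.
    """
    names = []
    for line in get_tasks_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("name: "):
            names.append(stripped[len("name: "):])
    return names


def blacklisted(names, patterns):
    """Return the task names that match any blacklist pattern."""
    return [n for n in names if any(fnmatch.fnmatchcase(n, p) for p in patterns)]


def abort_command(task_name, project_url=PROJECT_URL):
    """Build the boinccmd invocation that aborts one task."""
    return ["boinccmd", "--task", project_url, task_name, "abort"]


def abort_blacklisted():
    """Abort every task on this host whose name matches the blacklist."""
    output = subprocess.run(
        ["boinccmd", "--get_tasks"], capture_output=True, text=True, check=True
    ).stdout
    for name in blacklisted(parse_task_names(output), BLACKLIST):
        subprocess.run(abort_command(name), check=True)
```

Keeping the parsing and matching separate from the two boinccmd calls means the matching logic can be tried on saved output before anything is actually aborted.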

I'll code it and test it. If it doesn't work out then there may be workarounds. For example if a user has 2 GPUs (say a 670 and a 780) and finds type A tasks run OK on the 670 but not on the 780 then maybe the script could make sure type A tasks run only on the 670 and not on the 780. It may even be possible to direct task types to different hosts on the same LAN in case your 670 is in a different box than your 780, i.e. the old sneakernet concept but automated. And if it works across a LAN then why not over the WAN?

That's doable and it wouldn't require a lot of code maintenance on my part after the script is debugged and working properly.


EmSti [BlackOps]
Joined: 21 Apr 12
Posts: 4
Credit: 37,802,925
RAC: 0
Message 35291 - Posted: 23 Feb 2014 | 21:06:59 UTC

Please keep in mind this is a wish list item, not an "I think you are doing a bad job and should change to suit my needs" item. The initial reply went in a different direction than I anticipated with my intended wish, so I clarified at greater length (too much, I think).

Variable
Joined: 20 Nov 13
Posts: 21
Credit: 439,298,605
RAC: 57,041
Message 35374 - Posted: 26 Feb 2014 | 22:44:15 UTC - in response to Message 35277.

Ultimately this could even benefit future workunits on your end: as you find more people using this or that GPU on Cancer units, you could tweak those units to best run on those GPUs. We all have reasons why we crunch: loved ones lost to this or that disease, the idea of aliens coming, rocks hitting the Earth, etc. Let us help you and you might be surprised where it can take the two of us.


I would like to add support for the idea of being able to weight or prioritize different kinds of tasks. I personally have had multiple family members with cancer so I tend to want to focus my resources on that particular issue. Having the option to select what kind of research you want to run is one of the things I like about the F@H client. Not sure if or how well it really works, but I like the idea a lot and would love to see it implemented here too.

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 35379 - Posted: 27 Feb 2014 | 9:24:24 UTC - in response to Message 35374.
Last modified: 27 Feb 2014 | 14:41:32 UTC

@Variable. I understand your point but selecting projects would be very counterproductive to do here unfortunately. We would cause ourselves more harm than gain.
We use priorities for our tasks which are chosen internally for each project based on the urgency of finishing it as well as the total amount of simulation time needed. If users could override this setting we would be essentially fighting against you.
Also users all moving priorities to a specific disease would eventually force us to change research direction even if we don't have anything to research in that direction which could be a horrible waste of resources.
I don't know how they do it at F@H but I would have my doubts about how seriously they take these priorities.

In any case, from my experience, batches of workunits are crunched out on a monthly basis, meaning that all workunit batches we send out get returned quite fast. The difference between finishing simulation of a cancer project 3 weeks earlier than a methods project is not going to change anything really. Science (fortunately for the quality, unfortunately for the real world) moves slowly, which means that when our simulations finish within a month we need a good 6-12 months of analysis to get anything out of them.

@Dagorath: This could go either ok (if people use it as you suggest) or veeery badly. Even if it sounds a bit ugly, there are some times when there is a bit of a witch-hunt happening for specific WUs based on the reports of crunchers (NOELIAS for example). I am not saying that the user reports of failing WUs are wrong, but we get the overall statistics of failed WUs back from the server so we tend to know when the failure rate is especially high.* And trends seem to come and go with people hating one WU type and loving another for no statistically apparent reason. Allowing people to block specific WUs could create in such cases a bit of a problem. You are free to work on such a script but I have a feeling it won't be very liked by project scientists if it gets any momentum :P


* One thing we cannot see is for example if WUs caused crashes or reboots of your machine so this is always good to know so that we can try and fix it!

mikey
Joined: 2 Jan 09
Posts: 297
Credit: 5,726,511,115
RAC: 29,991,333
Message 35383 - Posted: 27 Feb 2014 | 12:29:16 UTC - in response to Message 35379.

@Variable. I understand your point but selecting projects would be very counterproductive to do here unfortunately. We would cause ourselves more harm than gain.
We use priorities for our tasks which are chosen internally for each project based on the urgency of finishing it as well as the total amount of simulation time needed. If users could override this setting we would be essentially fighting against you.
Also users all moving priorities to a specific disease would eventually force us to change research direction even if we don't have anything to research in that direction which could be a horrible waste of resources.
I don't know how they do it at F@H but I would have my doubts about how seriously they take these priorities.


I don't know how F@H does it either, but a simple 'if no units of the kind you select are available crunch all other kinds' check box would solve your problems. People would either crunch what they want and move elsewhere, or focus on what they want and then ALSO crunch the others when they are done. You sending the data forward after we are done with it is the key for us crunchers; we do NOT expect overnight results, most of us anyway. But if I can get x type of data to the next level quicker then I can feel like I am helping the memories of my lost loved ones more.

An analogy would be those that walk for MS, breast cancer, whatever...one weekend a year all those groups get out and do a concentrated effort, but all the rest of the year they do other things too.

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 35384 - Posted: 27 Feb 2014 | 14:48:42 UTC

I understand Mikey that people want to contribute to specific diseases. But the problem is that if for example we had a low-interest project in cancer (say verifying some results) and a super-important project on MS, would you still prefer to crunch the cancer WUs?
Or even simpler, no one's family is affected by "methods" disease. What will happen with the methods development that can improve all other research directions? It would get pushed back due to the user priorities and we would end up having to stop sending WU's for diseases to manage to get the methods WUs through (so that we can, in the end, speed up the disease WUs).
I personally don't think user preferences would work in such an environment.




Ps. As always, don't take my answers as official GPUGRID answers. Just providing the opinion of someone who's working on the inside.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 35385 - Posted: 27 Feb 2014 | 16:21:01 UTC - in response to Message 35384.

@Stefan

This thread has split into 2 different wishes:

1) a wish to have options in website preferences that would allow volunteers to decide how much to load their GPU, by "load" I mean how hard it works not which task type it works on

2) a wish to have options in website preferences that would allow volunteers to select which task type to work on

This thread started off with a wish for 1 and has now evolved into a discussion of wish 2, not that I am complaining about the change of direction, just attempting to clarify.

As per your advice in your previous post in this thread I will not, at this time, develop a script to allow users to filter out unwanted task types. I may at some time in the future but only if I can think of a way to prevent the problems it would cause. Thanks for your input on the idea. I never truly liked the idea due to the reasons you mentioned as well as the fact that some task types earn slightly more credit per hour than others and I can see how some volunteers would filter out low paying tasks in favor of the higher paying tasks. In the highly unlikely event I decide to pursue it again, I will inform project devs. The best way to solve the problem is proper host configuration which could mean adjusting parameters such as clocks and voltage on the fly via a clientside script. I might look into that possibility after I deliver on other commitments I've made regarding Crunchuntu.


skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 35386 - Posted: 27 Feb 2014 | 17:40:59 UTC - in response to Message 35385.

Regarding Task Type Choice (2).

I understand and respect the wish of some people to crunch towards one type of research, say Cancer, HIV or Neural Disorders, but some people might not understand the importance of crunching to develop new Methods of research, which could be applied to many research areas, and indeed could replace some slow, expensive and inaccurate research with less expensive, faster and more accurate science methodologies.

Unfortunately the research group isn't large enough to facilitate Task Type choice. If GPUGrid grew, possibly by facilitating other research groups through associated research, this situation could change, but even then I suspect it might not be feasible.

Other 'legitimate' reasons for task type selection are choosing tasks that better utilize resources (do more work), and running tasks that don't fail on your system (occasionally, individual cards tend to fail one task type more than another).

It was definitely a big concern in the past that people would just crunch the best paying WUs, rather than what the research group needed the quickest; however, we now have small badges for Scientific Publications. This means it's in badge collectors' interests to crunch for every type of research. While I do think that some people would still go for the credits, others might specifically target the tasks with lower credits in order to get a higher colour publication badge. I would be concerned that small batches of research could be gobbled up by a few crunchers and some people could completely miss a badge (this has happened at other projects)!

If task choice was facilitated, the project administrators (as with other projects) could change the weighting of task types, prioritising tasks that the project needed fastest. While this would allow individuals to crunch specifically for their chosen research type, let's say Cancer, ultimately the project would still have overall control of the proportion of work that gets completed; those who chose 2 types of work (say Cancer and HIV) would get more of one type (HIV) than the other (Cancer). This would of course make the whole idea of task choice false. It would be an exercise in futility, and would probably annoy those who ended up with lots of one type rather than equal amounts (Brown and Blue badges rather than 2 Green badges). So I would prefer the researchers didn't waste their time.
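
The effect of server-side weighting on a volunteer's choice can be illustrated with a small calculation. The sketch assumes, purely for illustration, that work is handed out in proportion to a per-type weight among the types a volunteer has selected; the weights and the function name are hypothetical.

```python
def expected_share(weights, selected):
    """Expected fraction of tasks per selected type under server-side weights."""
    total = sum(weights[t] for t in selected)
    return {t: weights[t] / total for t in selected}


# Hypothetical weights set by the project, not by the volunteer.
weights = {"Cancer": 1.0, "HIV": 3.0, "Methods": 4.0}
print(expected_share(weights, ["Cancer", "HIV"]))  # {'Cancer': 0.25, 'HIV': 0.75}
```

With these numbers a volunteer who ticked both Cancer and HIV would still receive three HIV tasks for every Cancer task, which is the imbalance described above.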