Wine 3D performance - where does the bottleneck lies and how

Post by James Huk »

Hello everyone.

I was wondering why most D3D games run much slower on Wine than on
Windows. I understand that some D3D functions need clever
workarounds to work on OpenGL, and I'm not complaining; I just want to
know how much more optimized you think Wine will become. I'm asking
all this because I'm thinking of something like this:

I have game X. This game uses D3D8/9 and works perfectly, but slowly.
Since it uses D3D8/9, I can test WineD3D on Windows, and I can tell
there is no difference in speed there, so the slowness is related only
to WineD3D, not to the Wine core. Now I would like to optimize the part
of Wine this particular game uses. I was thinking of doing the following:

1. Find out all the D3D functions that this game uses - shouldn't be too
hard with WINEDEBUG=+all or WINEDEBUG=+d3d (I think).
2. Write simple programs to test the speed of those functions on native D3D
on Windows and on WineD3D.
3. Then try to write code to make each function faster (yes, I know,
this sounds so "easy" when I say it here, and is pretty damn hard in
the real world).
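
Step 1 could be sketched roughly like this in Python. It is only an illustration: it assumes the log was captured with something like `WINEDEBUG=+d3d wine game.exe 2> log.txt` (game.exe is a placeholder) and that each debug line looks like `[thread-id:]class:channel:function message`. The exact line layout can differ between Wine versions, so check a real log first. The function names in the sample are made up.

```python
import re
from collections import Counter

# Assumed line shape: "[thread-id:]class:channel:function message".
# Check a real log before relying on this pattern.
LINE_RE = re.compile(
    r"^(?:[0-9a-fA-F]+:)?"                   # optional thread id prefix
    r"(?:trace|fixme|warn|err):"              # debug class
    r"(?P<channel>[A-Za-z0-9_]+):"            # debug channel, e.g. d3d or d3d9
    r"(?P<function>[A-Za-z_][A-Za-z0-9_]*)"   # name of the function that logged
)


def count_calls(lines, channel_prefix="d3d"):
    """Count log lines per function for channels starting with channel_prefix."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("channel").startswith(channel_prefix):
            counts[match.group("function")] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines; the function names are hypothetical.
    sample = [
        "0009:trace:d3d:example_draw_primitive (0x1) count 3",
        "0009:trace:d3d:example_draw_primitive (0x1) count 3",
        "trace:d3d9:example_set_texture iface 0x2",
        "0009:trace:heap:example_alloc 16 bytes",
    ]
    for name, hits in count_calls(sample).most_common():
        print(f"{hits:6d}  {name}")
```

The most frequently logged functions are a reasonable first guess at what to benchmark in step 2, but a log line count is not a time measurement, so it only narrows the list.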

Now the problem is... if the Wine code is already optimized to the
limit of the high-level programming language, then there is no point,
unless I learn x86 assembler and use it to optimize beyond what
the compiler does - but this is not going to happen anytime soon :( . But if
the code is written with stability in mind rather than performance,
then maybe I can do something to speed it up - even if only for a
single game.

So do you think there is a point in trying something like this?

Thanks in advance and sorry for my English.

Post by oiaohm »

James Huk, I am sorry, but there is a limit to what can be done with WineD3D.

Mapping from DirectX to OpenGL is never going to be a 100 percent clean process.

This branch of Gallium is working on a different solution to the problem: http://cgit.freedesktop.org/~jsindholt/ ... llium-nine i.e. avoid OpenGL and get closer to the hardware itself.

There is still a lot of debugging code and the like in WineD3D. All that printing of fixmes costs a lot of time, and even when disabled it still costs a little bit of time.

Optimizing in assembler is not worth doing. Yes, a lot of people will foolishly say it can help. The reason it does not help is that x86 CPUs these days, even ones from the same batch, take different amounts of time to process the same x86 instructions. So two x86 processors from the same batch can require two completely different pieces of asm to perform at their best. To make it worse, even the same instructions sent into the CPU at different times will take different times to run. The days of super fast asm are over. A lot of people who do that these days end up with snail-slow code and cannot work out why.

That is the reason why some developers close to gcc are developing this: http://ctuning.org/wiki/index.php/CTools:MilepostGCC so that the compiler can adjust to the CPU it has at hand.

Currently gcc is a very poor optimizer. The upcoming gcc 4.5.0 will help a bit.

Current gcc optimization only happens during the .c to .o conversion. It does not happen at the conversion from .o to the final binary. There are most likely areas in wined3d that could be optimized out but are not.

gcc 4.5.0 has a link-time optimisation option that does .o to final binary optimisation.

The problem here is that if you optimize them out by hand at this stage, you will make maintenance of the code harder.

Basically, lots of different issues are causing its speed problems. Some are within Wine's control, some are not.

Post by James Huk »

On Sun, Feb 7, 2010 at 2:49 AM, oiaohm wrote:
James Huk, I am sorry, but there is a limit to what can be done with WineD3D.

Mapping from DirectX to OpenGL is never going to be a 100 percent clean process.

Thank you for the answer.

Of course I understand that there will always be overhead because of
the D3D to OpenGL conversion. However, there are some games that run
significantly faster on Cedega, for example. One such game was
Rally Championship 2000 ("was" because now it doesn't work on Cedega
at all - typical).
While driving, it was 5 times faster on Cedega (practically Windows
speed) than on Wine, even though both wrappers ran it perfectly, so I
guess there is a way to optimize the Wine code a bit more. Also please
keep in mind that I'm talking here about program-specific optimizations
rather than generic optimizations. I also think that if a game runs
50% slower than on Windows, then there is room for improvement (at
least I hope there is)?

Post by Thunderbird »

The performance of WineD3D depends on the application and the display drivers. It is hard to say where we are losing performance and whether WineD3D is the one to blame.

To find out where the performance is lost, you need to profile the app using e.g. sysprof or oprofile. Then you can see where the time is spent. Typically we don't spend a lot of time in Wine code but outside Wine, in the display drivers. Some drivers don't offer good implementations of all OpenGL calls, which causes software emulation. It can also be that the drivers have to convert data to a different format before it is uploaded to the GPU. It can also be that we use OpenGL in an inefficient way for this app. It is very hard to figure out good optimizations.
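
As a rough illustration of reading such a profile, the Python sketch below groups sample counts by module to show whether the time goes to Wine code or to the driver. The `(module, symbol, samples)` tuples are a hypothetical, already-parsed form, not the actual output format of sysprof or oprofile, and all names and numbers in it are made up.

```python
from collections import defaultdict


def share_by_module(samples):
    """Return each module's fraction of all profile samples.

    samples: iterable of (module, symbol, sample_count) tuples.
    This tuple layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    for module, _symbol, count in samples:
        totals[module] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {module: count / grand_total for module, count in totals.items()}


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the result.
    profile = [
        ("libGL.so.1", "hypothetical_driver_upload", 450),
        ("libGL.so.1", "hypothetical_driver_convert", 150),
        ("wined3d.dll.so", "hypothetical_state_apply", 120),
        ("game.exe", "hypothetical_game_loop", 280),
    ]
    shares = share_by_module(profile)
    for module, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{share:6.1%}  {module}")
```

If most samples land in the driver library, as in this invented example, tuning WineD3D's own C code is unlikely to help much; the question becomes which OpenGL calls trigger the slow paths.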

Post by oiaohm »

James Huk, you will also notice that at times Cedega is faster than Wine on some games, while other games using the same functions don't work at all under Cedega but run perfectly under Wine.

I.e. a poorer implementation with fewer checks for features runs faster. Sorry, but that is the way it is at times. It's very hard to compare two different implementations that are complete to different levels and forecast anything.

There are a lot of different things that can still be done, James Huk, so there is hope.

Wine does not do application-specific optimizations in the code base any more (that stopped over 8 years ago). It makes maintenance of the code insanely hard. Either implement it right or not at all.

There may be some place for profiling Wine while running the game and then rebuilding Wine with the compiler laying out the functions differently to suit the game, leaving the Wine source base alone. These kinds of experiments have not been tried with Wine as far as I know. Of course, that is only going to have advantages on an application-by-application basis. Maybe some functions have their checks in the wrong order.

Basically, without collected data, changing the permanent order of checks in the main Wine source code would be reckless.

Thunderbird, a question: do we need something like Milepost for OpenGL/DirectX that can rebuild code in different ways depending on how smart or how dumb the video card is?

Post by Thunderbird »

At this point we don't need any application specific optimizations. There can be various areas which have not been implemented in the most efficient way. Those issues should be fixed.

Post by James Huk »

On Sun, Feb 7, 2010 at 4:10 PM, Thunderbird wrote:
At this point we don't need any application specific optimizations. There can be various areas which have not been implemented in the most efficient way. Those issues should be fixed.





Thanks for the information, everyone - that cleared things up a bit. One
more thing I was wondering... do you know if anybody has ever tried
compiling Wine using the Intel C Compiler (presumably the fastest x86
compiler there is)? Should it be possible to compile Wine using this
compiler "out of the box", or would some changes to the code need to be
made first?

Post by David Gerard »

On 9 February 2010 00:12, James Huk wrote:
Thanks for the information, everyone - that cleared things up a bit. One
more thing I was wondering... do you know if anybody has ever tried
compiling Wine using the Intel C Compiler (presumably the fastest x86
compiler there is)? Should it be possible to compile Wine using this
compiler "out of the box", or would some changes to the code need to be
made first?
http://wiki.winehq.org/icc - some experiments along these lines. Not
sure anyone's actively working to keep Wine compilable in icc, let
alone doing performance testing. Basically, give it a try and you'll
be helping keep the Wine codebase robust and cross-compiler compliant!


- d.

Post by James Huk »

---------- Forwarded message ----------
From: James Huk
Date: Fri, Feb 12, 2010 at 4:26 AM
Subject: Re: [Wine] Wine 3D performance - where does the bottleneck lies and how
To: David Gerard


On Tue, Feb 9, 2010 at 1:28 AM, David Gerard wrote:
http://wiki.winehq.org/icc - some experiments along these lines. Not
sure anyone's actively working to keep Wine compilable in icc, let
alone doing performance testing. Basically, give it a try and you'll
be helping keep the Wine codebase robust and cross-compiler compliant!


OK, I managed to compile it, well, most of it anyway - some tests
failed, or at least I think that's what is in the log. Also dxgi
(whatever that is) failed to compile, and no programs were built; I
had to go to the programs dir and type "make" there - then they
compiled without problems.
Hopefully nothing else is missing. And holy hell! The Wine source tree
after compilation takes 17.7 GB!! With GCC it usually takes a bit more
than 1 GB - a 16 GB difference is a bit weird.

As for speed - I will try to test that tomorrow (sorry, it is 4:23 AM
here ;] ), but I already managed to run Operation Flashpoint and I
must say I don't see any difference in speed; however, more apps need
to be tested. I will try some 3DMarks tomorrow. Anything else you
would recommend for speed tests?
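
For comparing the two builds by more than eye, a small Python sketch like the one below may help. It assumes the same benchmark scene is run several times per build and the average frames per second of each run is noted by hand. The fps numbers are invented, and the "noise" measure (spread of the baseline runs) is just one simple choice.

```python
from statistics import median


def compare_builds(baseline_fps, candidate_fps):
    """Compare repeated fps measurements of two builds.

    Returns (change, noise):
      change - relative difference of the medians, candidate vs. baseline
      noise  - relative spread (max - min) of the baseline runs
    """
    base = median(baseline_fps)
    cand = median(candidate_fps)
    change = (cand - base) / base
    noise = (max(baseline_fps) - min(baseline_fps)) / base
    return change, noise


if __name__ == "__main__":
    # Invented example numbers: three runs per build.
    gcc_runs = [40.0, 42.0, 41.0]
    icc_runs = [41.0, 43.0, 42.0]
    change, noise = compare_builds(gcc_runs, icc_runs)
    print(f"median change: {change:+.1%}, run-to-run noise: {noise:.1%}")
    if abs(change) <= noise:
        print("difference is within run-to-run noise")
```

A change smaller than the run-to-run spread, as in these invented numbers, would match the "I don't see any difference" impression; only a difference clearly larger than the noise says anything about the compiler.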

Compilation log can be found here:

http://wine.x.pl/wine-1.1.38-ICC-compilation.log.tar.gz