IMPORTANT NOTE: This document assumes that you have installed and configured Microsoft HPC properly and that the compute nodes can access the headnode. If Microsoft HPC is not configured properly, contact Microsoft for support before you attempt to install FLUENT. You can also download the Getting Started Guide for Windows HPC Server 2008 at http://technet.microsoft.com/en-us/library/cc793950.aspx. This guide is also available with the installation files for Microsoft® HPC Pack 2008 (HPCGettingStarted.rtf, in the root folder).
A client machine is any machine on the network that is not part of the cluster but can access the cluster through the network.
FLUENT uses the HOOPS graphics libraries from TechSoft 3D, and TechSoft 3D does not support Remote Desktop Connection for remote viewing. In many cases, launching FLUENT through Remote Desktop Connection will not cause display issues if the HPC cluster headnode has a PCI Express x16 graphics card installed with the latest graphics driver and the client machines are using the latest Remote Desktop Client (version 6). If these conditions cannot be met, you must launch FLUENT from a 64-bit client machine after installing Microsoft HPC Pack.
An alternative for sites that have not made 64-bit desktop workstations a standard is to purchase one such machine as a front end machine with Microsoft 2008 HPC Server installed. Users can then RDP into this machine and launch FLUENT.
NOTE: This should work but ONLY if the machine you are running Remote Desktop Connection to has a high-end, certified graphics card with a certified driver.
If you are running via Remote Desktop Connection the default driver that FLUENT will use is the Microsoft Windows driver.
If you do run into graphics issues check which driver FLUENT is using.
fluent --fluent_options "-driver opengl"
NOTE: Compiling and loading UDFs requires that you have a supported Microsoft compiler installed. For more information about supported compilers, see this FAQ: http://www.fluentusers.com/support/installation/winfaq/udf_par.htm
Though the usage described previously is recommended as an initial starting point for running FLUENT with the Microsoft Job Scheduler, there are further options provided to meet your specific needs. FLUENT allows you to do any of the following with the Microsoft Job Scheduler.
You can submit FLUENT jobs using the Microsoft Job Scheduler by using the -ccp flag in the FLUENT startup command.
Open a command prompt, change (cd) to the directory where your case and data files are located, and type:
fluent 3d -ccp headnode -tnprocs
You can request resources but delay the launching of FLUENT until the actual resources are allocated.
This method must be used if you have 32-bit client machines and want to run the 64-bit version of FLUENT on a 64-bit Windows cluster. You will need to have a journal file in order to run in batch mode. To learn more about journal files see the section called “Journal Files”.
Share your working directory and make sure that all nodes have “Full Control” permissions.
Open up a command prompt and browse to your working directory and type:
job submit /scheduler:head-node-name /numprocessors:4 /workdir:\\Working \\headnode\fluent\ntbin\win64\fluent.exe 3d -t4 -i journal_file.jou
Where head-node-name (headnode in the path to fluent.exe) is the computer name of the headnode and numprocessors is the number of processors you want to allocate. (In the above example you have asked for 4 processors to be allocated.)
Where 3d is the version of FLUENT you wish to use. You can choose 2d, 3d, 2ddp, or 3ddp, where dp means double precision.
Where -t# has to be equal to or less than numprocessors:#
Where journal_file.jou is your actual journal file name.
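The submission command above can also be assembled programmatically, which makes it easy to enforce the rule that the -t value must not exceed /numprocessors. The following is a minimal Python sketch and is not part of FLUENT or HPC Pack. The function name, the share name, and the host name are hypothetical placeholders, and only the switches that appear in the command above are used.

```python
VALID_VERSIONS = {"2d", "3d", "2ddp", "3ddp"}

def build_job_submit(scheduler, numprocessors, workdir, fluent_exe,
                     version, nprocs, journal):
    """Return the argument list for a 'job submit' command like the one above."""
    if version not in VALID_VERSIONS:
        raise ValueError(f"unknown FLUENT version: {version}")
    # The -t value has to be equal to or less than /numprocessors.
    if not 1 <= nprocs <= numprocessors:
        raise ValueError("-t value must be between 1 and numprocessors")
    return [
        "job", "submit",
        f"/scheduler:{scheduler}",
        f"/numprocessors:{numprocessors}",
        f"/workdir:{workdir}",
        fluent_exe, version, f"-t{nprocs}", "-i", journal,
    ]

# Hypothetical host name and share paths, for illustration only.
args = build_job_submit(
    scheduler="head-node-name",
    numprocessors=4,
    workdir=r"\\head-node-name\share",
    fluent_exe=r"\\head-node-name\fluent\ntbin\win64\fluent.exe",
    version="3d",
    nprocs=4,
    journal="journal_file.jou",
)
print(" ".join(args))
```

Building the command as a list keeps each switch separate, so a mistyped processor count is rejected before anything is submitted to the scheduler.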
You can use job templates to define sets of customized job submission policies. With job templates, administrators can effectively limit the types of jobs coming into the cluster, while also providing default values that aid users who are unfamiliar with the HPC Job Scheduler Service terminology. Because the job template provides default values, users can even submit jobs without specifying any job properties. For more information refer to the Getting Started Guide for Windows HPC Server 2008 at http://technet.microsoft.com/en-us/library/cc793950.aspx. This guide is also available with the installation files for Microsoft® HPC Pack 2008 (HPCGettingStarted.rtf, in the root folder).
NOTE: Job and Node templates do not work inside of Workbench. This issue has been fixed in R13.
You can specify which nodes, sockets or cores to run on by selecting the option on the Scheduler tab for Cores, Sockets, or Nodes. See the table below for examples.
|If you choose 2 Cores it will spawn 1 core on one node and 1 core on another node.|
|If you choose 2 Sockets it will spawn 2 cores on 1 node.|
|If you choose 2 Nodes it will spawn on ALL cores on 2 available nodes. NOTE: No other jobs or tasks can be started on that node, so it's quite similar to using the task Exclusive property.|
In the Environment tab in the FLUENT GUI add the following environment variable
CCP_NODES=%CCP_NODES% -cores # (where # is the number of cores you want to use on each node). For example, if you were to choose CCP_NODES=%CCP_NODES% -cores 4 it would spawn 4 cores on each of the nodes. NOTE: No other jobs or tasks can be started on the nodes, so it's quite similar to using the task Exclusive property.
In general, the rule is:
Use core allocation if the application is CPU intensive; the more processors you can throw at it the better! (FLUENT is CPU intensive)
Use socket allocation if memory access is what bottlenecks your application's performance. Since how much data can come in from memory is what limits the speed of the job, running more tasks on the same memory bus won't result in speed-up since all of those tasks are fighting over the path to memory.
Use node allocation if some node-wide resource is what bottlenecks your application. This is the case with applications that are relying heavily on access to disk or to networks resources. Running multiple tasks per node won't result in a speed-up since all of those tasks are waiting for access to the same disk or network pipe.
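The rule above amounts to a small lookup from the dominant bottleneck to the allocation unit. The Python sketch below is illustrative only; the dictionary keys and the function name are hypothetical labels chosen for this example, not HPC Job Scheduler terms.

```python
# Maps the dominant bottleneck of an application to the allocation unit
# suggested by the rule above. The keys are hypothetical labels.
ALLOCATION_RULE = {
    "cpu": "core",       # CPU intensive (FLUENT falls in this category)
    "memory": "socket",  # limited by the path to memory
    "disk": "node",      # limited by a node-wide disk
    "network": "node",   # limited by a node-wide network pipe
}

def allocation_unit(bottleneck):
    """Return 'core', 'socket', or 'node' for the given bottleneck label."""
    try:
        return ALLOCATION_RULE[bottleneck.lower()]
    except KeyError:
        raise ValueError(f"unknown bottleneck: {bottleneck}") from None

print(allocation_unit("cpu"))
```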
Some Key Facts
User Account Control (UAC) can help you prevent unauthorized changes to your computer. It works by prompting you for permission when a task requires administrative rights, such as installing software or changing settings that affect other users.
IPv6 is the latest address protocol and will eventually replace IPv4. From Windows Vista onward it is enabled by default, but IPv6 is not yet common, and many software products, routers, modems, and other network equipment, including ANSYS, do not support it yet. We recommend that you disable this protocol.
Please choose the link below from Microsoft's Web Site with Step by Step Instructions
Article ID: 929852 - Last Review: June 29, 2010 - Revision: 5.0.
How to disable certain Internet Protocol version 6 (IPv6) components in Windows Vista, Windows 7 and Windows Server 2008 http://support.microsoft.com/kb/929852
Problem Description: When trying to launch FLUENT from a Windows 7 client to a Windows 2008 HPC Server R2 you receive the following error in the FLUENT window:
'job' is not recognized as an internal or external command, operable program or batch file.
Description: Windows has a 260 character limit in the PATH variable. When installing Microsoft HPC 2008 R2 Client software it adds the bin path to the beginning of the PATH system variable. When the 260 character limit is reached, the bin path of the HPC client software is dropped from PATH, so the job command cannot be found and the error above appears in the FLUENT window. For more information about Windows file names and path limits please see: http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx#maxpath
To verify whether this is the root cause of the error, type path in a Command Window. If it is, the output of path will be missing the path to the HPC client bin directory.
Resolution: Trim the System PATH variable so that it does not exceed the 260 character Windows limit.
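The same check can be scripted. The following Python sketch reports the length of a PATH value, whether it exceeds the 260 character limit described above, and whether any entry contains a given fragment. The fragment "HPC" is an assumption used for illustration; substitute the actual HPC client bin directory on your machine.

```python
import os

def check_path(path_value, required_fragment, limit=260):
    """Summarize a Windows-style PATH value (entries separated by ';')."""
    entries = [entry for entry in path_value.split(";") if entry]
    return {
        "length": len(path_value),
        "over_limit": len(path_value) > limit,
        # Case-insensitive search, since Windows paths ignore case.
        "has_required": any(
            required_fragment.lower() in entry.lower() for entry in entries
        ),
    }

# "HPC" is a hypothetical fragment of the HPC client bin directory name.
report = check_path(os.environ.get("PATH", ""), "HPC")
print(report)
```

If over_limit is true and has_required is false, the symptoms match the problem described above.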
You launch FLUENT and the FLUENT window reports “Waiting for CCP Scheduler@headnode...” and hangs.
Resolution 1: The most likely reason FLUENT is hanging at this point is there is a username and/or password issue on any one of the compute nodes. The resolution is to clear the Cached Password and reset the password. If this is not the case look at Resolution 2.
Resolution 2: This behavior could also be caused by the order of the network bindings. Make sure that the Private NIC is listed first in Adapters and Bindings.
The public switch to the compute nodes does not need to be a high performance network. The head node will get a lot more traffic via the public network due to the file copies back and forth. For compute nodes, this will all occur on the private network, so put the Private NIC at the top. You can control which interface MPI uses with the CCP_MPI_NETMASK environment variable, for example: cluscfg setenvs CCP_MPI_NETMASK=IB_IP_ADDRESS/NETMASK.
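The ADDRESS/NETMASK value for CCP_MPI_NETMASK can be derived from the address of the interface you want MPI to use. The Python sketch below uses the standard ipaddress module. It assumes that the network address of the subnet (rather than the host address) is an acceptable first part of the value, and the interface address shown is hypothetical.

```python
import ipaddress

def ccp_mpi_netmask(interface_cidr):
    """Return 'network_address/netmask' for an interface given in CIDR form."""
    network = ipaddress.ip_interface(interface_cidr).network
    return f"{network.network_address}/{network.netmask}"

# Hypothetical address of the InfiniBand interface on one node.
value = ccp_mpi_netmask("10.0.1.15/24")
print(f"CCP_MPI_NETMASK={value}")
```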
Make sure that you have specified a “working” directory before launching FLUENT and that the directory is shared. This directory must also be mapped and you must use the mapped drive letter in the Working directory text box.
Check the order of the network bindings on the head node. The Private network binding should be the first in the list (on top). The preferred network binding order is Private, IB, MPI. Refer to the section, Preferred Network Binding Order for more information.
You launch FLUENT on a compute node or a client machine and it takes a long time to open up.
Resolution: Check the order of the NICs on all nodes in the cluster. The Private NIC should be listed first. See the section "Preferred Network Binding Order".
If you are experiencing slow load times when reading or writing out the case and data file check the bandwidth between the host and node 0. The command would be typed in the FLUENT window: (bandwidth 0 999999 " ")
For optimal FLUENT performance, low latency is essential.
Bandwidth is a term used to describe the amount of data that can be transferred over a network cable or network device in a fixed amount of time. Bandwidth is measured in bits per second (bps) or in higher units like Mbps (millions of bits per second).
Latency refers to any of several kinds of delays typically incurred in processing of network data. A so-called low latency network connection is one that generally experiences small delay times, while a high latency connection generally suffers from long delays.
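To see how the two quantities combine, a rough estimate of transfer time is the per-message latency multiplied by the number of messages, plus the total number of bits divided by the bandwidth. The Python sketch below is a simplified model that ignores protocol overhead and overlapping transfers, and the latency figures in it are hypothetical. Note that bandwidth is given in bits per second, so a size in bytes is multiplied by 8.

```python
def transfer_time_seconds(size_bytes, bandwidth_mbps,
                          latency_seconds=0.0, messages=1):
    """Estimate the time to move size_bytes split across 'messages' messages."""
    bits = size_bytes * 8  # bandwidth is in bits, file sizes are in bytes
    return messages * latency_seconds + bits / (bandwidth_mbps * 1_000_000)

# One large transfer: latency hardly matters.
bulk = transfer_time_seconds(125_000_000, 1000)

# One million 100-byte messages on the same 1000 Mbps link, with two
# hypothetical per-message latencies: the latency term dominates.
high_latency = transfer_time_seconds(100_000_000, 1000,
                                     latency_seconds=50e-6, messages=1_000_000)
low_latency = transfer_time_seconds(100_000_000, 1000,
                                    latency_seconds=2e-6, messages=1_000_000)
print(bulk, high_latency, low_latency)
```

With many small messages, as in a parallel solver exchanging data between partitions, the total time is governed mostly by latency, which is why a low latency interconnect matters more than raw bandwidth.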
Once FLUENT is launched in the FLUENT window type:
(bandwidth # # "")
where the first # is the first node on the cluster (0) and the second # is the last node on the cluster that you are using to run FLUENT. For example, (bandwidth 0 7 "") can be used to measure the bandwidth and latency between node 0 and node 7 (8 cores).
You can expect to see similar results based on the table below. If you do not then run the bandwidth command again using 1 node at a time, for example, (bandwidth 0 1 ""), (bandwidth 0 2 ""), (bandwidth 0 3 ""), etc., until you can identify the node that is experiencing a network problem. Correct the problem and then run the bandwidth command again.
Millions of Bits Per Second
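When a cluster has many nodes, the one-node-at-a-time bandwidth commands described above can be generated rather than typed by hand. The Python sketch below only produces the command text to paste into the FLUENT window; the function name is hypothetical.

```python
def pairwise_bandwidth_commands(last_node, first_node=0):
    """Return one (bandwidth first n "") command per node after first_node."""
    return [
        f'(bandwidth {first_node} {n} "")'
        for n in range(first_node + 1, last_node + 1)
    ]

# Commands for an 8-process run (nodes 0 through 7).
for command in pairwise_bandwidth_commands(7):
    print(command)
```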
Type in any of the following cluscfg commands:
|cluscfg delcreds|Deletes the cached credentials for one or more specified users that the HPC Job Scheduler Service uses to submit jobs.|
|cluscfg listenvs|Displays the values of the cluster-wide environment variables.|
|cluscfg listparams|Displays the values of the cluster-wide parameters.|
|cluscfg setcreds|Sets the credentials to use for the specified user when submitting jobs, and stores the credentials in the credential cache.|
|cluscfg setenvs|Sets the values of one or more specified cluster-wide environment variables.|
|cluscfg setparams|Sets the values of one or more specified cluster-wide parameters.|
|cluscfg view|Displays statistics for the specified HPC cluster, including the name of the cluster, version of the HPC Pack installed on the cluster, and the number of nodes, cores, jobs, and tasks in various states.|
See the Windows HPC Server 2008 Command Reference for additional commands.
A journal file contains a sequence of FLUENT commands, arranged as they would be typed interactively into the program or entered through the GUI. The GUI commands are recorded as Scheme code lines in journal files. FLUENT creates a journal file by recording everything you type on the command line or enter through the GUI. You can also create journal files manually with a text editor.
The purpose of a journal file is to automate a series of commands instead of entering them repeatedly on the command line. Another use is to produce a record of the input to a program session for later reference, although transcript files are often more useful for this purpose.
Command input is taken from the specified journal file until its end is reached, at which time control is returned to the standard input (usually the keyboard). Each line from the journal file is echoed to the standard output (usually the screen) as it is read and processed. Refer to the FLUENT documentation for more information about how to create journal files.
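As an illustration of a manually created journal file, the Python sketch below generates the text of a short batch journal. The text-interface commands it writes (file/read-case-data, solve/iterate, file/write-case-data, exit) and the trailing confirmation are assumptions based on typical FLUENT batch journals; verify them against the documentation for your FLUENT version. The file names are hypothetical.

```python
def journal_text(case_file, iterations, output_file):
    """Return the contents of a minimal batch journal as a single string."""
    lines = [
        "; Read the case and data files",
        f'/file/read-case-data "{case_file}"',
        f"; Run {iterations} iterations",
        f"/solve/iterate {iterations}",
        "; Write the results",
        f'/file/write-case-data "{output_file}"',
        "; Exit FLUENT and confirm",
        "exit",
        "yes",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file names, for illustration only.
text = journal_text("example.cas", 100, "example_out.cas")
print(text)
```

The returned string can be saved under the name passed to the -i option, for example with open("journal_file.jou", "w").write(text). Lines starting with a semicolon are treated as comments.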
A transcript file contains a complete record of all standard input to and output from FLUENT (usually all keyboard and GUI input and all screen output). GUI commands are recorded as Scheme code lines in transcript files. FLUENT creates a transcript file by recording everything typed as input or entered through the GUI, and everything printed as output in the text window.
The purpose of a transcript file is to produce a record of the program session for later reference. Because they contain messages and other output, transcript files (unlike journal files) cannot be read back into the program. Refer to the FLUENT documentation for more information about how to create transcript files.
1) We have a choice of 40 Gbit/s or 20 Gbit/s InfiniBand. When running FLUENT, can we take advantage of the 40 Gbit/s speed on a cluster of 80 or fewer nodes, or is that overkill? Some of our other software vendors have told me that 40 Gbit/s is overkill for the amount of data they can move and that 20 Gbit/s is fine.
Comment: For jobs of 80 or fewer cores, 20 Gbit/s may be sufficient for FLUENT, depending on the case size. However, FLUENT scales well beyond 80 cores, where faster InfiniBand may help, and in the future they may also want to run on more cores. Depending on the price difference between 20 Gbit/s and 40 Gbit/s InfiniBand, I do not consider 40 Gbit/s overkill for FLUENT, though we have not done any quantitative comparison between these two InfiniBand speeds.
Updated (October 29, 2010)