ANSYS FLUENT Installation FAQs

 Installing and Configuring ANSYS FLUENT Network Parallel - HPC

IMPORTANT NOTE: This document assumes that you have installed and configured Microsoft HPC properly and that the compute nodes can access the headnode. If Microsoft HPC is not configured properly contact Microsoft for support before you attempt to install FLUENT. You can also download the Getting Started Guide for Windows HPC Server 2008 at http://technet.microsoft.com/en-us/library/cc793950.aspx. This guide is also available with the installation files for Microsoft® HPC Pack 2008 (HPCGettingStarted.rtf, in the root folder).

  1. Install FLUENT (it is only necessary to install FLUENT on the headnode). With the new directory structure, FLUENT is installed in C:\Program Files\ANSYS Inc\v121\fluent.
  2. Share the fluent directory located under C:\Program Files\ANSYS Inc\v121\ so that all computers on the cluster can access it through the network.
  3. Set up your FLUENT working directory as a shared network drive. (The working directory is the directory where your case and data files reside.)
     Windows 2008 Server
    • Go to the Start Menu, Computer, and select Map Network Drive from the menu near the top of the screen.
    • Select a drive letter, then press Browse... and navigate to your working directory, for example, C:\Working.

 

 Configuring HPC Clients

A client machine is any machine on the network that is not part of the cluster but can access the cluster through the network.

 Requirements

  1. Client machines must be running Microsoft Windows XP 64-bit, Windows Vista 64-bit, Windows 7 64-bit, or Windows Server 2008 64-bit.
  2. If you only have 32-bit clients, you can still run FLUENT on the HPC cluster, but you must run in batch mode using a journal file.
  3. Client machines must have a high-end video card with the latest graphics driver from the vendor installed. For a list of certified video cards, send an email to diana.collier@ansys.com.

 Remote Desktop Connection Support

FLUENT uses the HOOPS graphics libraries from TechSoft 3D, and TechSoft 3D does not support Remote Desktop Connection for remote viewing. In many cases, launching FLUENT through Remote Desktop Connection causes no display issues if the HPC cluster headnode has a PCI Express x16 graphics card installed with the latest graphics driver and the client machines are using the latest Remote Desktop Client (version 6). If these conditions cannot be met, you must launch FLUENT from a 64-bit client machine after installing Microsoft HPC Pack.

An alternative for sites that have not made 64-bit desktop workstations a standard is to purchase one such machine as a front-end machine with Microsoft 2008 HPC Server installed. Users can then RDP into this machine and launch FLUENT.

 FLUENT 12 Startup Command

NOTE: This should work but ONLY if the machine you are running Remote Desktop Connection to has a high-end, certified graphics card with a certified driver.

If you are running via Remote Desktop Connection the default driver that FLUENT will use is the Microsoft Windows driver.

If you do run into graphics issues check which driver FLUENT is using.

  1. Open up FLUENT
  2. Choose the Help menu, Version
  3. Check the Graphics Version. If it reads msw/win you are using the operating system Windows driver.

  4. Try using the OpenGL driver by starting FLUENT with the following flag:
    fluent --fluent_options "-driver opengl"
  5. Check the Graphics Version once FLUENT is launched to verify that it is using the OpenGL driver.

 Changing the DISPLAY to the Classic FLUENT graphics Display

  1. Open FLUENT
  2. Choose the Display Menu
  3. Choose Options
  4. In the Color Scheme drop down choose Classic.

Configuring HPC Client Machines to Access the Cluster

  1. Install the HPC Pack 2008 on the client machines, either from the reminst network share on the headnode (C:\Program Files\Microsoft HPC Pack\Data\reminst) or from the CD.
  2. Open up the reminst directory and double-click on the setup.exe file. There could be some additional programs required, for example, .NET Framework, etc. The installer will prompt you to install these programs.

Options with FLUENT

 Launching FLUENT and Selecting Options

  1. The easiest approach is to create a shortcut on each client machine to the fluent.exe file on the headnode. The fluent.exe file is located in C:\Program Files\ANSYS Inc\v121\fluent\ntbin\win64.
  2. Launch FLUENT from the shortcut on your desktop.
  3. Choose your Dimension, Display Options. Under Options choose Double Precision if necessary. Choose Use Microsoft Job Scheduler.
  4. Under Processing Options choose Parallel per MS Job Scheduler and then enter the Number of Processes you will be using.
  5. Select Show More >>
  6. Select the Parallel Settings tab.
  7. Select the appropriate options under Interconnects and MPI Types. (Ethernet and msmpi are the defaults when running on 2008 HPC Server.)
  8. Choose the Scheduler tab and type in the name of the headnode.
  9. If you will be compiling and loading User Defined Functions (UDFs), choose the UDF Compiler tab and check the box Setup Compilation Environment for UDF.

NOTE: Compiling and loading UDFs requires a supported Microsoft compiler. For more information about supported compilers, see this FAQ: http://www.fluentusers.com/support/installation/winfaq/udf_par.htm

  10. Press OK to launch FLUENT.

 Launching FLUENT from the Command Line

Though the usage described previously is recommended as an initial starting point for running FLUENT with the Microsoft Job Scheduler, there are further options provided to meet your specific needs. FLUENT allows you to do any of the following with the Microsoft Job Scheduler.

You can submit FLUENT jobs using the Microsoft Job Scheduler by using the -ccp flag in the FLUENT startup command.

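Open a command prompt, change (cd) to the directory where your case and data files are located, and type:

fluent 3d -ccp headnode -tnprocs

where headnode is the computer name of the headnode and nprocs is the number of processes to start (for example, -t4).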

 Request Resources from the Job Scheduler Before you Launch FLUENT

You can request resources but delay the launching of FLUENT until the actual resources are allocated.

  1. In the FLUENT GUI, choose the Scheduler tab then check "Start When Resources are Available"

 Run a FLUENT Job in Batch Mode (required if your client machine is 32-bit)

This method must be used if you have 32-bit client machines and want to run the 64-bit version of FLUENT on a 64-bit Windows cluster. You will need to have a journal file in order to run in batch mode. To learn more about journal files see the section called “Journal Files”.

Share your working directory and make sure that all nodes have “Full Control” permissions.

Open up a command prompt and browse to your working directory and type:

job submit /scheduler:headnode /numprocessors:4 /workdir:\\Working \\headnode\fluent\ntbin\win64\fluent.exe 3d -t4 -i journal_file.jou

In this command:

    • headnode is the computer name of the headnode.
    • numprocessors is the number of processors you want to allocate (4 in the example above).
    • 3d is the version of FLUENT you wish to use. You can choose 2d, 3d, 2ddp, or 3ddp, where dp means double precision.
    • -t# sets the number of FLUENT processes; # must be equal to or less than the value given in /numprocessors:#.
    • journal_file.jou is the name of your journal file.
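
How the parts of this command fit together can be sketched in Python. This is a minimal illustration, not part of FLUENT or HPC Pack: the function name build_job_submit and its parameter names are hypothetical, and it only assembles the flags that appear in the example above.

```python
VALID_VERSIONS = ("2d", "3d", "2ddp", "3ddp")  # versions listed in this FAQ


def build_job_submit(headnode, numprocessors, workdir, fluent_exe,
                     version, nprocs, journal):
    """Return the batch-mode 'job submit' command line as one string.

    Hypothetical helper: it only reproduces the flags used in this FAQ.
    """
    if version not in VALID_VERSIONS:
        raise ValueError(f"unknown FLUENT version: {version!r}")
    # -t# must be equal to or less than /numprocessors:#
    if not 1 <= nprocs <= numprocessors:
        raise ValueError("-t value must be between 1 and numprocessors")
    return (
        f"job submit /scheduler:{headnode} /numprocessors:{numprocessors} "
        f"/workdir:{workdir} {fluent_exe} {version} -t{nprocs} -i {journal}"
    )


if __name__ == "__main__":
    # Same values as the example command in this section.
    print(build_job_submit(
        headnode="headnode",
        numprocessors=4,
        workdir=r"\\Working",
        fluent_exe=r"\\headnode\fluent\ntbin\win64\fluent.exe",
        version="3d",
        nprocs=4,
        journal="journal_file.jou",
    ))
```

Checking the -t value before submitting catches a request for more FLUENT processes than the scheduler was asked to allocate.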

 Job Templates

You can use job templates to define sets of customized job submission policies. With job templates, administrators can effectively limit the types of jobs coming into the cluster, while also providing default values that aid users who are unfamiliar with the HPC Job Scheduler Service terminology. Because the job template provides default values, users can even submit jobs without specifying any job properties. For more information refer to the Getting Started Guide for Windows HPC Server 2008 at http://technet.microsoft.com/en-us/library/cc793950.aspx. This guide is also available with the installation files for Microsoft® HPC Pack 2008 (HPCGettingStarted.rtf, in the root folder).

NOTE: Job and Node templates do not work inside of Workbench. This issue has been fixed in R13.

Cores, Sockets and Nodes

 What are cores, sockets, and nodes?

  1. Core: Refers to a single processing unit capable of performing computations.  A core is the smallest unit of allocation available in HPC Server 2008.
  2. Socket: Refers to a collection of cores with a direct pipe to memory.  Each socket contains 1 or more cores.  Note that this does not necessarily refer to a physical socket, but rather to the memory architecture of the machine, which will depend on your chip vendor.
  3. Node: Refers to an entire compute node.  Each node contains 1 or more sockets.

  Specifying nodes, sockets, or cores for FLUENT runs

You can specify which nodes, sockets, or cores to run on by selecting Cores, Sockets, or Nodes on the Scheduler tab. See the table below for examples.

Unit      Example
Cores     If you choose 2 Cores it will spawn 1 core on one node and 1 core on another node.
Sockets   If you choose 2 Sockets it will spawn 2 cores on 1 node.
Nodes     If you choose 2 Nodes it will spawn on ALL cores on 2 available nodes. NOTE: No other jobs or tasks can be started on those nodes, so it is quite similar to using the task Exclusive property.

 Specifying Specific Cores on Nodes

In the Environment tab in the FLUENT GUI add the following environment variable

CCP_NODES=%CCP_NODES% -cores # (where # is the number of cores you want to use on each node). For example, if you were to choose CCP_NODES=%CCP_NODES% -cores 4 it would spawn 4 cores on each of the nodes. NOTE: No other jobs or tasks can be started on the nodes, so it's quite similar to using the task Exclusive property.

 When to Specify Cores, Sockets, or Nodes

In general, the rule is:

Use core allocation if the application is CPU intensive; the more processors you can throw at it the better! (FLUENT is CPU intensive)

Use socket allocation if memory access is what bottlenecks your application's performance.  Since how much data can come in from memory is what limits the speed of the job, running more tasks on the same memory bus won't result in speed-up since all of those tasks are fighting over the path to memory.

Use node allocation if some node-wide resource is what bottlenecks your application.  This is the case with applications that rely heavily on access to disk or to network resources.  Running multiple tasks per node won't result in a speed-up since all of those tasks are waiting for access to the same disk or network pipe.
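
The rule of thumb above can be written down as a small lookup. This is only a sketch of the guidance in this section: the bottleneck labels and the function name choose_unit are hypothetical and are not options of FLUENT or the HPC Job Scheduler.

```python
# Rule of thumb from this section: dominant bottleneck -> allocation unit.
UNIT_FOR_BOTTLENECK = {
    "cpu": "Cores",       # CPU intensive work (FLUENT falls in this group)
    "memory": "Sockets",  # limited by the path to memory
    "disk": "Nodes",      # limited by a node-wide resource
    "network": "Nodes",   # limited by a node-wide resource
}


def choose_unit(bottleneck):
    """Return the allocation unit suggested for the given bottleneck label."""
    try:
        return UNIT_FOR_BOTTLENECK[bottleneck.lower()]
    except KeyError:
        raise ValueError(f"unknown bottleneck: {bottleneck!r}") from None


if __name__ == "__main__":
    for label in ("cpu", "memory", "disk", "network"):
        print(label, "->", choose_unit(label))
```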

Some Key Facts

  1. The unit type set on your job also applies to all tasks in that job (i.e. you can't have a job requesting 4 nodes with a bunch of tasks requesting 2 cores each).
  2. You can still use batch scripts or your application's mechanisms to launch multiple threads or processes on the resources that your job is allocated.
  3. By using these correctly, you can improve your cluster utilization since jobs are more likely to get only the resources they need. 

Troubleshooting FLUENT/HPC Issues

Windows 7 Specific Issues

Turning User Account Control on or off

User Account Control (UAC) can help you prevent unauthorized changes to your computer. It works by prompting you for permission when a task requires administrative rights, such as installing software or changing settings that affect other users.

  1. Open User Accounts by clicking the Start button, clicking Control Panel, clicking User Accounts and Family Safety (or clicking User Accounts, if you are connected to a network domain), and then clicking User Accounts.
  2. Click Turn User Account Control on or off.  If you are prompted for an administrator password or confirmation, type the password or provide confirmation.

Disable IPv6

IPv6 is the latest Internet addressing protocol and will eventually replace IPv4. It has been enabled by default since Windows Vista, but it is not yet in common use, and many software products, routers, modems, and other network equipment, including ANSYS software, do not yet support it. We recommend that you disable this protocol.

Follow the step-by-step instructions in this article on Microsoft's web site:

How to disable certain Internet Protocol version 6 (IPv6) components in Windows Vista, Windows 7 and Windows Server 2008 (Article ID: 929852, last reviewed June 29, 2010, revision 5.0): http://support.microsoft.com/kb/929852

Error: 'job' is not recognized as an internal or external command, operable program or batch file.

Problem Description: When trying to launch FLUENT from a Windows 7 client to a Windows 2008 HPC Server R2 you receive the following error in the FLUENT window:

'job' is not recognized as an internal or external command, operable program or batch file.

Description: Windows has a 260 character limit in the PATH variable. When you install the Microsoft HPC 2008 R2 client software, the installer adds its bin path to the beginning of the PATH system variable. If the 260 character limit is exceeded, the entry for the HPC client bin path is lost and the error above appears in the FLUENT window. For more information about Windows file names and path limits please see: http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx#maxpath

To verify that this is the root cause, type path in a Command Window. If the output does not include the path to the HPC client bin directory, the PATH limit is the cause.

Resolution: Trim the System PATH variable so that it does not exceed the 260 character Windows limit.
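
The check described above can also be scripted. The sketch below assumes a Windows-style PATH with entries separated by semicolons. The 260 character figure is the one quoted in this FAQ, and the search string "HPC" is a guess at how the HPC client bin directory is named, so adjust it to match your installation. The function name check_path is hypothetical.

```python
import os

DOCUMENTED_LIMIT = 260  # character limit quoted in this FAQ


def check_path(path_value, needle="HPC", limit=DOCUMENTED_LIMIT):
    """Report the PATH length and any entries that contain `needle`.

    `needle` is an assumed marker for the HPC client bin directory.
    """
    entries = [entry for entry in path_value.split(";") if entry]
    return {
        "length": len(path_value),
        "over_limit": len(path_value) > limit,
        "hpc_entries": [e for e in entries if needle.lower() in e.lower()],
    }


if __name__ == "__main__":
    report = check_path(os.environ.get("PATH", ""))
    print("PATH length:", report["length"])
    print("Over the documented limit:", report["over_limit"])
    print("Entries that look like the HPC client:", report["hpc_entries"])
```

If over_limit is true and hpc_entries is empty, the situation matches the problem described in this section.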

 FLUENT hangs when launching

You launch FLUENT and the FLUENT window reports “Waiting for CCP Scheduler@headnode...” and hangs. 

Resolution 1: The most likely reason FLUENT hangs at this point is a username and/or password issue on one of the compute nodes. The resolution is to clear the cached password and reset the password. If this does not help, see Resolution 2.

 Clearing the Cached Password

  1. Open up the HPC Job Manager.
  2. Open the Options Menu.
  3. Choose "Clear Cached Job Credentials"

 Resetting the Password

  1. Open up the HPC Job Manager
  2. Open up the Actions menu, Job Submission, New Job
  3. Choose Task List in the left panel
  4. In the Command Line box type: cmd.exe
  5. Choose Save, then Submit.
  6. You will be prompted to enter and save your password.
  7. Restart FLUENT

Resolution 2: This behavior could also be caused by the order of the network bindings. Make sure that the Private NIC is listed first in Adapters and Bindings.

 Preferred Network Binding Order

The public switch to the compute nodes does not need to be a high performance network. The head node will get a lot more traffic via the public network due to the file copies back and forth. For compute nodes, this will all occur on the private network, so put the Private NIC at the top. You can control which interface MPI uses with the CCP_MPI_NETMASK environment variable, for example, cluscfg setenvs CCP_MPI_NETMASK=IB_IP_ADDRESS/NETMASK.

  1. Open Control Panel, Network and Sharing Center, Manage Network Connections. On the menu bar choose the Advanced menu, Advanced Settings, then the Adapters and Bindings tab. If the Private NIC is not listed first, move it up to the first position. Do this on ALL compute nodes on the cluster.
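
The CCP_MPI_NETMASK value mentioned above takes an address and a netmask separated by a slash. As a sketch, Python's standard ipaddress module can derive that string from an interface address. Two assumptions are made here: the address part is written as the masked network address, and the variable is set with cluscfg setenvs (the command listed in the cluscfg table in this document). The example address 10.1.0.25 and the function names are hypothetical.

```python
import ipaddress


def mpi_netmask_value(interface):
    """Return 'network_address/netmask' for 'ip/prefix' or 'ip/netmask' input."""
    network = ipaddress.ip_interface(interface).network
    return f"{network.network_address}/{network.netmask}"


def setenvs_command(interface):
    """Build the cluscfg command line (assumed form, see the note above)."""
    return f"cluscfg setenvs CCP_MPI_NETMASK={mpi_netmask_value(interface)}"


if __name__ == "__main__":
    # Hypothetical address of the interface that MPI traffic should use.
    print(setenvs_command("10.1.0.25/16"))
```

Run the resulting command from a command prompt on the cluster and confirm the setting with cluscfg listenvs.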

 A FLUENT Job Seems to Run indefinitely within the HPC Job Scheduler when running in batch mode

Make sure that you have specified a “working” directory before launching FLUENT and that the directory is shared. This directory must also be mapped and you must use the mapped drive letter in the Working directory text box.

 Error when trying to write out a FLUENT Case or Data file

Check the order of the network bindings on the head node. The Private network binding should be the first in the list (on top). The preferred network binding order is Private, IB, MPI. Refer to the section, Preferred Network Binding Order for more information.

 FLUENT starts up slowly when launched from a compute node

You launch FLUENT on a compute node or a client machine and it takes a long time to open.

Resolution: Check the order of the NICs on all nodes on the cluster. The Private NIC should be listed first. See the section "Preferred Network Binding Order".

 Slow Read or Write Times with Large Case and Data Files

If you are experiencing slow read or write times with case and data files, check the bandwidth between the host and node 0. Type the following command in the FLUENT window: (bandwidth 0 999999 " ")

Parallel Performance Issues

For optimal FLUENT performance, low latency is essential.

 Measuring Bandwidth and Latency between Nodes

Bandwidth is a term used to describe the amount of data that can be transferred over a network cable or network device in a fixed amount of time. Bandwidth is measured in bits per second (bps) or in higher units like Mbps (millions of bits per second).

Latency refers to any of several kinds of delays typically incurred in processing of network data. A so-called low latency network connection is one that generally experiences small delay times, while a high latency connection generally suffers from long delays.
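
To see how the two quantities combine, the sketch below estimates the time for a single transfer as latency plus size divided by bandwidth. It is an idealized model that ignores protocol overhead and contention, and the function name transfer_seconds is hypothetical.

```python
def transfer_seconds(size_bytes, bandwidth_mbps, latency_us=0.0):
    """Idealized time for one transfer, in seconds.

    bandwidth_mbps is in millions of bits per second and latency_us is in
    microseconds, matching the units used in this section.
    """
    bits = size_bytes * 8
    return latency_us / 1e6 + bits / (bandwidth_mbps * 1e6)


if __name__ == "__main__":
    size = 125_000_000  # 125 million bytes, i.e. 1,000 million bits
    for mbps in (100, 1000):
        print(f"{mbps} Mbps: {transfer_seconds(size, mbps):.2f} s")
```

For large transfers the bandwidth term dominates. For the many small messages exchanged between parallel processes, the latency term dominates, which is why low latency matters for FLUENT.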

Once FLUENT is launched, type the following in the FLUENT window:

(bandwidth # # "")

where the first # is the first node on the cluster (0) and the second # is the last node on the cluster that you are using to run FLUENT. For example, (bandwidth 0 7 "") can be used to measure the bandwidth and latency between node 0 and 7 (8 cores).

You can expect results similar to those in the table below. If you do not see them, run the bandwidth command again one node at a time, for example, (bandwidth 0 1 ""), (bandwidth 0 2 ""), (bandwidth 0 3 ""), etc., until you identify the node that is experiencing a network problem. Correct the problem and then run the bandwidth command again.

Interconnect            Latency (microseconds)    Bandwidth (millions of bits per second, Mbps)
GigE                    50                        100
Infiniband
    Winsock Direct      15                        800
    TCP/IP              45                        200
    IBAL                5                         1000
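
The node-by-node check and the comparison against the reference values above can be sketched in Python. The script only generates the command strings to type into the FLUENT window and compares numbers you enter by hand; it does not talk to FLUENT. The function names and the 50% tolerance are hypothetical choices, not ANSYS recommendations.

```python
# Reference values from the table above: (latency in microseconds, bandwidth in Mbps).
REFERENCE = {
    "GigE": (50, 100),
    "Infiniband Winsock Direct": (15, 800),
    "Infiniband TCP/IP": (45, 200),
    "Infiniband IBAL": (5, 1000),
}


def bandwidth_commands(last_node, first_node=0):
    """Return one (bandwidth ...) command per node pair, one node at a time."""
    return [
        f'(bandwidth {first_node} {node} "")'
        for node in range(first_node + 1, last_node + 1)
    ]


def looks_degraded(interconnect, latency_us, bandwidth_mbps, tolerance=0.5):
    """Flag a measurement that is far from the reference values.

    `tolerance` is an arbitrary illustrative threshold (50% by default).
    """
    ref_latency, ref_bandwidth = REFERENCE[interconnect]
    too_slow = latency_us > ref_latency * (1 + tolerance)
    too_narrow = bandwidth_mbps < ref_bandwidth * (1 - tolerance)
    return too_slow or too_narrow


if __name__ == "__main__":
    for command in bandwidth_commands(3):
        print(command)
    print(looks_degraded("GigE", latency_us=200, bandwidth_mbps=95))
```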

 Setting and Checking HPC Environment Variables using Cluscfg

Type in any of the following cluscfg commands:

CLUSCFG               DESCRIPTION
cluscfg delcreds      Deletes the cached credentials for one or more specified users that the HPC Job Scheduler Service uses to submit jobs.
cluscfg listenvs      Displays the values of the cluster-wide environment variables.
cluscfg listparams    Displays the values of the cluster-wide parameters.
cluscfg setcreds      Sets the credentials to use for the specified user when submitting jobs, and stores the credentials in the credential cache.
cluscfg setenvs       Sets the values of one or more specified cluster-wide environment variables.
cluscfg setparams     Sets the values of one or more specified cluster-wide parameters.
cluscfg view          Displays statistics for the specified HPC cluster, including the name of the cluster, version of the HPC Pack installed on the cluster, and the number of nodes, cores, jobs, and tasks in various states.

See Windows HPC Server 2008 Command Reference for more command line commands

 Journal Files

A journal file contains a sequence of FLUENT commands, arranged as they would be typed interactively into the program or entered through the GUI. The GUI commands are recorded as Scheme code lines in journal files. FLUENT creates a journal file by recording everything you type on the command line or enter through the GUI. You can also create journal files manually with a text editor.

The purpose of a journal file is to automate a series of commands instead of entering them repeatedly on the command line. Another use is to produce a record of the input to a program session for later reference, although transcript files are often more useful for this purpose.

Command input is taken from the specified journal file until its end is reached, at which time control is returned to the standard input (usually the keyboard). Each line from the journal file is echoed to the standard output (usually the screen) as it is read and processed. Refer to the FLUENT documentation for more information about how to create journal files.
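
As an illustration of creating a journal file manually, the sketch below writes a short journal for a batch run. The text commands used (/file/read-case-data, /solve/iterate, /file/write-case-data, exit) are assumed from the FLUENT text user interface and should be checked against the FLUENT documentation for your version. FLUENT may also ask for confirmations that this sketch does not answer. The file names and function names are hypothetical.

```python
def journal_text(case_file, iterations, output_file):
    """Return the contents of a minimal batch journal as a string.

    The command names are assumptions; verify them for your FLUENT version.
    """
    lines = [
        f"/file/read-case-data {case_file}",
        f"/solve/iterate {iterations}",
        f"/file/write-case-data {output_file}",
        "exit",
    ]
    return "\n".join(lines) + "\n"


def write_journal(path, case_file, iterations, output_file):
    """Write the journal to `path` so it can be passed to FLUENT with -i."""
    with open(path, "w") as handle:
        handle.write(journal_text(case_file, iterations, output_file))


if __name__ == "__main__":
    # Hypothetical file names, printed rather than written to disk.
    print(journal_text("example.cas", 100, "result.cas"), end="")
```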

 Transcript Files

A transcript file contains a complete record of all standard input to and output from FLUENT (usually all keyboard and GUI input and all screen output). GUI commands are recorded as Scheme code lines in transcript files. FLUENT creates a transcript file by recording everything typed as input or entered through the GUI, and everything printed as output in the text window.

The purpose of a transcript file is to produce a record of the program session for later reference. Because they contain messages and other output, transcript files (unlike journal files) cannot be read back into the program. Refer to the FLUENT documentation for more information about how to create transcript files.

1) We have a choice of 40 Gbit/s or 20 Gbit/s Infiniband. When running FLUENT, can we take advantage of the 40 Gbit/s speed on a cluster of 80 or fewer nodes, or is that overkill? Some of our other software vendors have told me that 40 Gbit/s is overkill for the amount of data they can move and that 20 Gbit/s is fine.

Comment: For jobs of 80 or fewer cores, 20 Gbit/s may be sufficient for FLUENT, depending on the case size. But FLUENT scales well beyond 80 cores, where faster Infiniband may help, and in the future you may want to run on more cores. Depending on the price difference between 20 and 40 Gbit/s Infiniband, I would not consider 40 Gbit/s overkill for FLUENT, though we have not done any quantitative comparison between the two speeds.

Updated (October 29, 2010)