Thin and light laptops are great for an urban nomadic lifestyle of planes and coffee houses with WiFi. But the cooling is marginal for large compiles/builds (e.g., the kernel) and totally inadequate for CPU- and GPU-intensive games like recent versions of Doom or Quake. The thermal protection kicks in before the first level is completed. My interim solution has been to run the original Doom, which runs hot but not too hot. Given that laptops are expected to outsell desktops this year, I expect I am not alone in my frustration at not being able to play games and in worrying about frying my laptop.
For a non-realtime program like a large compile/build, running part-time (run a bit, pause a bit, repeat) seems like a reasonable approach. While there are some research efforts using the throttling built into the Linux kernel and Intel CPUs, most are oriented toward thermal management and throttle down all processes. I want just the CPU-intensive program throttled. And I like to keep my programs as hardware and operating system independent as possible. Plus, running research code in my only computer's operating system makes me nervous.
Many mechanical and electronic systems run part-time -- air conditioning, for instance. The dutycycle is the fraction of time the system is on, typically expressed as a percentage, from 0 percent (always off) to 100 percent (always on). A related metric is the length of an on-off cycle, the cycle time. The dutycycle program in Listing 1 lets you control both. At the default setting of 50 percent dutycycle and a 100 millisecond cycle time (i.e., on for 50 milliseconds, off for 50 milliseconds), the CPU temperature rise during kernel builds is cut almost in half.
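The arithmetic behind those two settings can be sketched on its own. This is a minimal illustration, not the program itself: the helper names seconds_to_timespec() and split_cycle() are made up for this example, and the split assumes the same percent-of-cycle formula described above.

```c
#include <assert.h>
#include <math.h>
#include <time.h>

/* Convert a non-negative duration in seconds to a struct timespec. */
static struct timespec seconds_to_timespec(double seconds)
{
    double whole;
    double frac = modf(seconds, &whole); /* split into integer and fraction */
    struct timespec ts;

    ts.tv_sec = (time_t)whole;
    ts.tv_nsec = (long)(1.0e9 * frac);
    return ts;
}

/* Split one cycle into its on and off portions.
 * cycle_time is in seconds, percent is the on percentage (1 to 100). */
static void split_cycle(double cycle_time, long percent,
                        struct timespec *on, struct timespec *off)
{
    assert(cycle_time > 0.0 && 0 < percent && percent <= 100);
    *on = seconds_to_timespec(cycle_time * percent / 100.0);
    *off = seconds_to_timespec(cycle_time * (100 - percent) / 100.0);
}
```

For example, a 2.0 second cycle at 25 percent gives 0.5 seconds on and 1.5 seconds off. Because the values pass through floating point, the nanosecond field can be off by one for cycle times that are not exactly representable.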
/* Listing 1: dutycycle */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <math.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define VERSION "1.0.0"

static volatile pid_t pgrp; /* process group of children */

/* catch-all signal handler, terminates all child processes */
static void terminate_children(int signum)
{
    signal(signum, SIG_DFL);
    kill(-pgrp, signum);
    /* wake the group if it is paused, so it can act on the signal */
    kill(-pgrp, SIGCONT);
    raise(signum);
}

int main(int argc, char *const argv[])
{
    int c;
    pid_t childpid;
    int errflg = 0;
    int status;
    int debug = 0;
    long dutycycle = 50;       /* default: 50% */
    double cycle_time = 0.100; /* default: 100 milliseconds */

    while ((c = getopt(argc, argv, "+p:c:vD")) != -1) {
        char *endptr;

        switch (c) {
        case 'p': /* duty cycle percentage */
            dutycycle = strtol(optarg, &endptr, 10);
            if (!(0 < dutycycle && dutycycle <= 100 && *endptr == '\0')) {
                fprintf(stderr, "dutycycle must be integer, 1 to 100 (percent)\n");
                ++errflg;
            }
            break;
        case 'c': /* cycle time in seconds */
            cycle_time = strtod(optarg, &endptr);
            if (!(0.0 < cycle_time && *endptr == '\0')) {
                fprintf(stderr, "cycle time must be a positive floating point number\n");
                ++errflg;
            }
            break;
        case 'v':
            printf("Version: %s\n", VERSION);
            exit(EXIT_SUCCESS);
        case 'D':
            debug = 1;
            break;
        case ':': /* option without operand */
            fprintf(stderr, "Option -%c requires an operand\n", optopt);
            errflg++;
            break;
        case '?':
            fprintf(stderr, "Unrecognized option: -%c\n", optopt);
            errflg++;
            break;
        default:
            assert(0);
        }
    }

    if (optind >= argc) { /* nothing to run */
        fprintf(stderr, "no command given\n");
        errflg++;
    }

    if (errflg) {
        fprintf(stderr, "usage: dutycycle [-p #] [-c #] [-v] [-D] command ...\n");
        exit(2);
    }

    if ((childpid = fork()) == -1) {
        perror("fork");
        exit(EXIT_FAILURE);
    }

    if (childpid == 0) { /* child */
        status = setpgid(0, 0); /* become leader of a new process group */
        if (status != 0) {
            perror(argv[0]);
            exit(EXIT_FAILURE);
        }
        if (debug) {
            fprintf(stderr, "CHILD: process = %d, group = %d, session = %d\n",
                    (int)getpid(), (int)getpgrp(), (int)getsid(0));
        }
        execvp(argv[optind], argv + optind);
        /* only reached if execvp() failed */
        perror(argv[optind]);
        exit(EXIT_FAILURE);
    } else { /* parent */
        double whole_seconds;
        double fractional_seconds;

        /* Also set the group from the parent so it exists before the first
         * signal is sent; this fails harmlessly if the child already did it. */
        (void)setpgid(childpid, childpid);
        pgrp = childpid; /* child(ren) process group */

        if (signal(SIGTERM, &terminate_children) == SIG_ERR) {
            perror(argv[0]);
            exit(EXIT_FAILURE);
        }
        if (signal(SIGQUIT, &terminate_children) == SIG_ERR) {
            perror(argv[0]);
            exit(EXIT_FAILURE);
        }
        if (signal(SIGINT, &terminate_children) == SIG_ERR) {
            perror(argv[0]);
            exit(EXIT_FAILURE);
        }

        fractional_seconds = modf(cycle_time * dutycycle / 100.0, &whole_seconds);
        struct timespec const on_time = {
            (time_t)(whole_seconds), (long)(1.0e9 * fractional_seconds)
        };
        fractional_seconds = modf(cycle_time * (100 - dutycycle) / 100.0, &whole_seconds);
        struct timespec const off_time = {
            (time_t)(whole_seconds), (long)(1.0e9 * fractional_seconds)
        };

        if (debug) {
            fprintf(stderr, "%fs cycle, %ld%% on.\n", cycle_time, dutycycle);
            fprintf(stderr, "Process %d is controlling child process %d\n",
                    (int)getpid(), (int)childpid);
            fprintf(stderr, "PARENT: process = %d, group = %d, session = %d\n",
                    (int)getpid(), (int)getpgrp(), (int)getsid(0));
        }

        close(0); /* close stdin */
        close(1); /* close stdout */

        /* while forked child process running ... */
        while (waitpid(childpid, &status, WNOHANG) == 0) {
            struct timespec remainder;

            /* sleep until on-cycle time up */
            remainder = on_time;
            /* keep calling nanosleep until not interrupted */
            do {
                status = nanosleep(&remainder, &remainder);
            } while (status == -1 && errno == EINTR);
            if (status != 0) {
                perror(argv[0]);
                exit(EXIT_FAILURE);
            }

            if (debug)
                fprintf(stderr, "STOP\n");
            /* pause child process group */
            status = kill(-pgrp, SIGSTOP);
            if (status != 0) {
                perror(argv[0]);
                exit(EXIT_FAILURE);
            }

            /* sleep until off-cycle time complete */
            remainder = off_time;
            do {
                status = nanosleep(&remainder, &remainder);
            } while (status == -1 && errno == EINTR);
            if (status != 0) {
                perror(argv[0]);
                exit(EXIT_FAILURE);
            }

            if (debug)
                fprintf(stderr, "CONT\n");
            /* resume the child process group */
            status = kill(-pgrp, SIGCONT);
            if (status != 0) {
                perror(argv[0]);
                exit(EXIT_FAILURE);
            }
        }
    }
    return EXIT_SUCCESS;
}
The basic idea of dutycycle is to fork off the controlled program, sleep for a bit, pause the controlled program, sleep a bit more, then resume it, repeating until the controlled program exits. The POSIX standard signals STOP and CONT (continue) pause and resume a process, respectively. That is OS independent enough for me and requires no experimental kernel code. This code has only been tested on Linux, but should work on most Unix work-alikes and other sufficiently POSIX-compliant operating systems. (POSIX is a bunch of standards, not just one; how many parts are implemented, and how well, varies.)
As is frequently the case, the swamp turned out easier to get into than get out of. For example, the make program recursively spawns child processes for each level of subdirectories and to run the compiler/linker/etc. Stopping and resuming the make process alone will not affect any of its child processes. However, signalling the whole process group will affect them all.
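The effect of signalling a process group can be seen in isolation with a small sketch. The helper names (spawn_in_group(), stop_group_and_count(), kill_group()) are hypothetical and exist only for this illustration; two idle children stand in for make and its compiler processes, and a single kill() with a negated group ID reaches both.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork an idle child and put it in process group pgid.
 * pgid == 0 starts a new group led by the child itself. */
static pid_t spawn_in_group(pid_t pgid)
{
    pid_t pid = fork();

    if (pid == 0) {          /* child: join the group, then idle */
        setpgid(0, pgid);
        for (;;)
            pause();
    }
    if (pid > 0)
        setpgid(pid, pgid);  /* set it from the parent too, avoiding a race */
    return pid;
}

/* Send SIGSTOP to the whole group with one kill() call, then count how
 * many of the listed members are reported as stopped. */
static int stop_group_and_count(pid_t pgid, const pid_t *members, int n)
{
    int stopped = 0;

    assert(pgid > 0 && n >= 0);
    if (kill(-pgid, SIGSTOP) != 0)  /* negative pid means "process group" */
        return -1;
    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(members[i], &status, WUNTRACED) == members[i] &&
            WIFSTOPPED(status))
            stopped++;
    }
    return stopped;
}

/* Kill and reap every member so the demonstration leaves nothing behind. */
static void kill_group(pid_t pgid, const pid_t *members, int n)
{
    kill(-pgid, SIGKILL);
    for (int i = 0; i < n; i++)
        waitpid(members[i], NULL, 0);
}
```

Calling spawn_in_group(0) for a leader, then spawn_in_group(leader) for a second member, and passing both to stop_group_and_count() should report two stopped processes. In a real build the members are not created by hand: processes forked by make inherit its process group unless they deliberately change it.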
Just when you think you have the denizens of the swamp (i.e., the STOP and CONT signals) tamed, their siblings (TERM, QUIT, and INT) show up and want attention too. Dutycycle needs to catch these and pass them on to the whole process group so the whole mess can be gracefully stopped. Otherwise, dutycycle exits but the controlled programs keep running. The terminate_children() handler does this. Another complication is that the nanosleep() calls can be interrupted by a signal and must be restarted, hence the inner loops.
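The restart idiom for nanosleep() can be pulled out into a helper to make the pattern easier to see. This is a sketch only; the name sleep_fully() is invented for this example. It relies on nanosleep() writing the unslept time into its second argument when it returns early with EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for the whole duration, resuming whenever a signal cuts the
 * sleep short. Returns 0 on success, or -1 with errno set on a real error. */
static int sleep_fully(struct timespec duration)
{
    int status;

    assert(duration.tv_sec >= 0 &&
           duration.tv_nsec >= 0 && duration.tv_nsec < 1000000000L);
    do {
        /* on EINTR, nanosleep() stores the time still owed in duration */
        status = nanosleep(&duration, &duration);
    } while (status == -1 && errno == EINTR);
    return status;
}
```

A caller would pass the on or off interval by value, for instance a struct timespec of { 0, 50000000L } for 50 milliseconds. Each restart rounds the remaining time, so a sleep interrupted many times may run slightly long.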
Dutycycle is invoked like the nice program:
dutycycle [OPTION] [COMMAND [ARG]...]
That is, the program name, its options if any, then the controlled program with its own options and arguments. Dutycycle's options are:
-p #      on percentage, integer between 1 and 100, default 50.
-c #.#    full on/off cycle time, positive floating point number in seconds, default 0.100, i.e., 100 milliseconds.
-v        print version number on standard output and exit.
-D        enable debug information printout.
-?        print usage message and exit.