Commit Graph

25287 Commits

Author SHA1 Message Date
Colin Cross
2bb3e31015 Merge commit 'v3.0.8' into android-3.0 2011-10-27 15:01:19 -07:00
Russell King - ARM Linux
6be918e0b2 mmc: block: allow get_card_status() to return error status
If the MMC_SEND_STATUS command is not successful, we should not return
a zero status word, but instead allow the caller to know positively
that an error occurred.

Convert the open-coded get_card_status() to use the helper function,
and provide definitions for the card state field.

Change-Id: Icfd6258af78a89c21abac386c556153fa3fac364
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-10-26 17:11:38 -07:00
Colin Cross
9e4ee8100f ram_console: pass in a boot info string
Allow the board file to pass a boot info string through the
platform data that is appended to the /proc/last_kmsg file.

Change-Id: I37065fafb09676085465c93384d8e176fdd942d6
Signed-off-by: Colin Cross <ccross@android.com>
2011-10-25 16:40:00 -07:00
Steven Rostedt
371d201b6d ftrace: Fix warning when CONFIG_FUNCTION_TRACER is not defined
commit 04da85b86188f224cc9b391b5bdd92a3ba20ffcf upstream.

The struct ftrace_hash was declared within CONFIG_FUNCTION_TRACER
but was referenced outside of it.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-10-16 14:14:55 -07:00
Steven Rostedt
d7f04c486e ftrace: Fix regression of :mod:module function enabling
commit 43dd61c9a09bd413e837df829e6bfb42159be52a upstream.

The new code that allows different utilities to pick and choose
what functions they trace broke the :mod: hook that allows users
to trace only functions of a particular module.

The reason is that the :mod: hook bypasses the hash that is setup
to allow individual users to trace their own functions and uses
the global hash directly. But if the global hash has not been
set up, it will cause a bug:

echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter

produces:

 [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
 [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
 BUG: unable to handle kernel paging request at ffffffff8160ec90
 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
 Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt

 Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
 RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
 Stack:
  0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
  ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
  ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
 Call Trace:
  [<ffffffff810d92f5>] match_records+0x155/0x1b0
  [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
  [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
  [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
  [<ffffffff81166e48>] vfs_write+0xc8/0x190
  [<ffffffff81167001>] sys_write+0x51/0x90
  [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
 Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
  RSP <ffff88003a96bda8>
 CR2: ffffffff8160ec90
 ---[ end trace a5d031828efdd88e ]---

Reported-by: Brian Marete <marete@toshnix.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-16 14:14:55 -07:00
Richard Cochran
a94a36c352 ptp: fix L2 event message recognition
commit f75159e9936143177b442afc78150b7a7ad8aa07 upstream.

The IEEE 1588 standard defines two kinds of messages, event and general
messages. Event messages require time stamping, and general do not. When
using UDP transport, two separate ports are used for the two message
types.

The BPF designed to recognize event messages incorrectly classifies L2
general messages as event messages. This commit fixes the issue by
extending the filter to check the message type field for L2 PTP packets.
Event messages are be distinguished from general messages by testing
the "general" bit.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-16 14:14:53 -07:00
Peter Zijlstra
249cf808ba posix-cpu-timers: Cure SMP wobbles
commit d670ec13178d0fd8680e6742a2bc6e04f28f87d8 upstream.

David reported:

  Attached below is a watered-down version of rt/tst-cpuclock2.c from
  GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
  similar.

  Run it several times, and you will see cases where the main thread
  will measure a process clock difference before and after the nanosleep
  which is smaller than the cpu-burner thread's individual thread clock
  difference.  This doesn't make any sense since the cpu-burner thread
  is part of the top-level process's thread group.

  I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
  64-bit binaries).

  For example:

  [davem@boricha build-x86_64-linux]$ ./test
  process: before(0.001221967) after(0.498624371) diff(497402404)
  thread:  before(0.000081692) after(0.498316431) diff(498234739)
  self:    before(0.001223521) after(0.001240219) diff(16698)
  [davem@boricha build-x86_64-linux]$

  The diff of 'process' should always be >= the diff of 'thread'.

  I make sure to wrap the 'thread' clock measurements the most tightly
  around the nanosleep() call, and that the 'process' clock measurements
  are the outer-most ones.

  ---
  #include <unistd.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <fcntl.h>
  #include <string.h>
  #include <errno.h>
  #include <pthread.h>

  static pthread_barrier_t barrier;

  static void *chew_cpu(void *arg)
  {
	  pthread_barrier_wait(&barrier);
	  while (1)
		  __asm__ __volatile__("" : : : "memory");
	  return NULL;
  }

  int main(void)
  {
	  clockid_t process_clock, my_thread_clock, th_clock;
	  struct timespec process_before, process_after;
	  struct timespec me_before, me_after;
	  struct timespec th_before, th_after;
	  struct timespec sleeptime;
	  unsigned long diff;
	  pthread_t th;
	  int err;

	  err = clock_getcpuclockid(0, &process_clock);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
	  if (err)
		  return 1;

	  pthread_barrier_init(&barrier, NULL, 2);
	  err = pthread_create(&th, NULL, chew_cpu, NULL);
	  if (err)
		  return 1;

	  err = pthread_getcpuclockid(th, &th_clock);
	  if (err)
		  return 1;

	  pthread_barrier_wait(&barrier);

	  err = clock_gettime(process_clock, &process_before);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_before);
	  if (err)
		  return 1;

	  err = clock_gettime(th_clock, &th_before);
	  if (err)
		  return 1;

	  sleeptime.tv_sec = 0;
	  sleeptime.tv_nsec = 500000000;
	  nanosleep(&sleeptime, NULL);

	  err = clock_gettime(th_clock, &th_after);
	  if (err)
		  return 1;

	  err = clock_gettime(my_thread_clock, &me_after);
	  if (err)
		  return 1;

	  err = clock_gettime(process_clock, &process_after);
	  if (err)
		  return 1;

	  diff = process_after.tv_nsec - process_before.tv_nsec;
	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 process_before.tv_sec, process_before.tv_nsec,
		 process_after.tv_sec, process_after.tv_nsec, diff);
	  diff = th_after.tv_nsec - th_before.tv_nsec;
	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 th_before.tv_sec, th_before.tv_nsec,
		 th_after.tv_sec, th_after.tv_nsec, diff);
	  diff = me_after.tv_nsec - me_before.tv_nsec;
	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
		 me_before.tv_sec, me_before.tv_nsec,
		 me_after.tv_sec, me_after.tv_nsec, diff);

	  return 0;
  }

This is due to us using p->se.sum_exec_runtime in
thread_group_cputime() where we iterate the thread group and sum all
data. This does not take time since the last schedule operation (tick
or otherwise) into account. We can cure this by using
task_sched_runtime() at the cost of having to take locks.

This also means we can (and must) do away with
thread_group_sched_runtime() since the modified thread_group_cputime()
is now more accurate and would deadlock when called from
thread_group_sched_runtime().

Aside of that it makes the function safe on 32 bit systems. The old
code added t->se.sum_exec_runtime unprotected. sum_exec_runtime is a
64bit value and could be changed on another cpu at the same time.

Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
Tested-by: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-16 14:14:51 -07:00
Wu Fengguang
ac693061b1 writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage
commit 6e6938b6d3130305a5960c86b1a9b21e58cf6144 upstream.

sync(2) is performed in two stages: the WB_SYNC_NONE sync and the
WB_SYNC_ALL sync. Identify the first stage with .tagged_writepages and
do livelock prevention for it, too.

Jan's commit f446daaea9 ("mm: implement writeback livelock avoidance
using page tagging") is a partial fix in that it only fixed the
WB_SYNC_ALL phase livelock.

Although ext4 is tested to no longer livelock with commit f446daaea9,
it may due to some "redirty_tail() after pages_skipped" effect which
is by no means a guarantee for _all_ the file systems.

Note that writeback_inodes_sb() is called by not only sync(), they are
treated the same because the other callers also need livelock prevention.

Impact:  It changes the order in which pages/inodes are synced to disk.
Now in the WB_SYNC_NONE stage, it won't proceed to write the next inode
until finished with the current inode.

Acked-by: Jan Kara <jack@suse.cz>
CC: Dave Chinner <david@fromorbit.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03 11:40:43 -07:00
Mark Brown
7edcab4419 mfd: Fix value of WM8994_CONFIGURE_GPIO
commit 8efcc57dedfebc99c3cd39564e3fc47cd1a24b75 upstream.

This needs to be an out of band value for the register and on this device
registers are 16 bit so we must shift left one to the 17th bit.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03 11:40:26 -07:00
John Stultz
1ed2053563 rtc: Fix RTC PIE frequency limit
commit 938f97bcf1bdd1b681d5d14d1d7117a2e22d4434 upstream.

Thomas earlier submitted a fix to limit the RTC PIE freq, but
picked 5000Hz out of the air. Willy noticed that we should
instead use the 8192Hz max from the rtc man documentation.

Cc: Willy Tarreau <w@1wt.eu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03 11:40:08 -07:00
Jiri Slaby
a38df1a013 TTY: pty, fix pty counting
commit 24d406a6bf736f7aebdc8fa0f0ec86e0890c6d24 upstream.

tty_operations->remove is normally called like:
queue_release_one_tty
 ->tty_shutdown
   ->tty_driver_remove_tty
     ->tty_operations->remove

However tty_shutdown() is called from queue_release_one_tty() only if
tty_operations->shutdown is NULL. But for pty, it is not.
pty_unix98_shutdown() is used there as ->shutdown.

So tty_operations->remove of pty (i.e. pty_unix98_remove()) is never
called. This results in invalid pty_count. I.e. what can be seen in
/proc/sys/kernel/pty/nr.

I see this was already reported at:
  https://lkml.org/lkml/2009/11/5/370
But it was not fixed since then.

This patch is kind of a hackish way. The problem lies in ->install. We
allocate there another tty (so-called tty->link). So ->install is
called once, but ->remove twice, for both tty and tty->link. The fix
here is to count both tty and tty->link and divide the count by 2 for
user.

And to have ->remove called, let's make tty_driver_remove_tty() global
and call that from pty_unix98_shutdown() (tty_operations->shutdown).

While at it, let's document that when ->shutdown is defined,
tty_shutdown() is not called.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03 11:39:49 -07:00
Alexandre Bounine
ae0250b118 rapidio: fix use of non-compatible registers
commit 284fb68d00c56e971ed01e0b4bac5ddd4d1b74ab upstream.

Replace/remove use of RIO v.1.2 registers/bits that are not
forward-compatible with newer versions of RapidIO specification.

RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.

Use of removed (since RIO v.1.3) register bits affects users of
currently available 1.3 and 2.x compliant devices who may use not so
recent kernel versions.

Removing checks for unsupported bits makes corresponding routines
compatible with all versions of RapidIO specification.  Therefore,
backporting makes stable kernel versions compliant with RIO v.1.3 and
later as well.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Chul Kim <chul.kim@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-10-03 11:39:46 -07:00
Jaikumar Ganesh
8bf44e90c8 HID: Add input_register callback.
Add input_register callback which gets called after
hid_configure_usage is called for all the reports
and before the input device is registered. This allows
individual drivers to do extra work like input mapping just
before device registration.

Based on discussions with David Herrmann <dh.herrmann@googlemail.com>

Change-Id: Idab6fb4f7b1e5e569bd0410967288717e9d34c98
Signed-off-by: Jaikumar Ganesh <jaikumarg@android.com>
2011-09-21 13:47:49 -07:00
Dima Zavin
6b82151ce5 usb: otg_id: add suspend/resume interface
It is possible that while one driver has already suspended,
another driver calls otg_id_notify() because it has not yet been
suspended. It would then be possible for the suspended driver's
detect callback to be called. This is undesirable.

Introduce new otg_id_suspend/otg_id_resume functions that
keep a suspended count, and if a notification happens while
someone is suspended, that notification is deferred until
all the drivers are resumed. If the notification happens before
the last driver is suspended, that suspend will be aborted
and once the final driver resumes through otg_id_resume, the
notification will be delivered.

Change-Id: I32fd32bec65e366e5f97a25c15255d94773b85b3
Signed-off-by: Dima Zavin <dima@android.com>
2011-09-11 12:38:48 -07:00
Colin Cross
07b26cd3da mmc: core: host: only use wakelock for detect work
There is no need to take a wakelock for delayed lazy disable
work, it will be cancelled in the suspend handler and force
disabled.  Only take the wakelock when the detect work is
queued, and make sure to drop the wakelock if the work is
cancelled.

Change-Id: I1e507a5f98848954ea21d45e23b6192c3132a349
Signed-off-by: Colin Cross <ccross@android.com>
2011-09-07 17:30:23 -07:00
Andi Kleen
512228f0be Add a personality to report 2.6.x version numbers
commit be27425dcc516fd08245b047ea57f83b8f6f0903 upstream.

I ran into a couple of programs which broke with the new Linux 3.0
version.  Some of those were binary only.  I tried to use LD_PRELOAD to
work around it, but it was quite difficult and in one case impossible
because of a mix of 32bit and 64bit executables.

For example, all kind of management software from HP doesnt work, unless
we pretend to run a 2.6 kernel.

  $ uname -a
  Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux

  $ hpacucli ctrl all show

  Error: No controllers detected.

  $ rpm -qf /usr/sbin/hpacucli
  hpacucli-8.75-12.0

Another notable case is that Python now reports "linux3" from
sys.platform(); which in turn can break things that were checking
sys.platform() == "linux2":

  https://bugzilla.mozilla.org/show_bug.cgi?id=664564

It seems pretty clear to me though it's a bug in the apps that are using
'==' instead of .startswith(), but this allows us to unbreak broken
programs.

This patch adds a UNAME26 personality that makes the kernel report a
2.6.40+x version number instead.  The x is the x in 3.x.

I know this is somewhat ugly, but I didn't find a better workaround, and
compatibility to existing programs is important.

Some programs also read /proc/sys/kernel/osrelease.  This can be worked
around in user space with mount --bind (and a mount namespace)

To use:

  wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
  gcc -o uname26 uname26.c
  ./uname26 program

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-29 13:29:16 -07:00
Suresh Siddha
6857336c7f x86, mtrr: lock stop machine during MTRR rendezvous sequence
commit 6d3321e8e2b3bf6a5892e2ef673c7bf536e3f904 upstream.

MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_machine mutex in the MTRR rendezvous sequence that
gets called from an online cpu (here we are in the process context
and can potentially sleep while taking the mutex). And the MTRR rendezvous
that gets triggered during cpu online doesn't need to take this stop_machine
lock (as the stop_machine() already ensures that there is no cpu hotplug
going on in parallel by doing get_online_cpus())

    TBD: Pursue a cleaner solution of extending the stop_machine()
         infrastructure to handle the case where the calling cpu is
         still not online and use this for MTRR rendezvous sequence.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008

Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-29 13:29:08 -07:00
Nishanth Menon
efcf89c8eb panic: Add board ID to panic output
At times, it is necessary for boards to provide some additional information
as part of panic logs. Provide information on the board hardware as part
of panic logs.

It is safer to print this information at the very end in case something
bad happens as part of the information retrieval itself.

To use this, set global mach_panic_string to an appropriate string in the
board file.

Change-Id: Id12cdda87b0cd2940dd01d52db97e6162f671b4d
Signed-off-by: Nishanth Menon <nm@ti.com>
2011-08-19 11:41:00 -07:00
Peter Zijlstra
0db4b32c1f mm: Fix fixup_user_fault() for MMU=n
commit 5c723ba5b7886909b2e430f2eae454c33f7fe5c6 upstream.

In commit 2efaca927f5c ("mm/futex: fix futex writes on archs with SW
tracking of dirty & young") we forgot about MMU=n.  This patch fixes
that.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/1311761831.24752.413.camel@twins
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15 18:31:43 -07:00
Neil Horman
60f17a7798 net: add IFF_SKB_TX_SHARED flag to priv_flags
[ Upstream commit d8873315065f1f527c7c380402cf59b1e1d0ae36 ]

Pktgen attempts to transmit shared skbs to net devices, which can't be used by
some drivers as they keep state information in skbs.  This patch adds a flag
marking drivers as being able to handle shared skbs in their tx path.  Drivers
are defaulted to being unable to do so, but calling ether_setup enables this
flag, as 90% of the drivers calling ether_setup touch real hardware and can
handle shared skbs.  A subsequent patch will audit drivers to ensure that the
flag is set properly

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Jiri Pirko <jpirko@redhat.com>
CC: Robert Olsson <robert.olsson@its.uu.se>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15 18:31:38 -07:00
David S. Miller
e997d47bff net: Compute protocol sequence numbers and fragment IDs using MD5.
Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky <dan@doxpara.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15 18:31:35 -07:00
David S. Miller
2468b895fc crypto: Move md5_transform to lib/md5.c
We are going to use this for TCP/IP sequence number and fragment ID
generation.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-15 18:31:35 -07:00
Colin Cross
9abd59b0df Merge commit 'v3.0.1' into android-3.0 2011-08-10 18:07:59 -07:00
Colin Cross
4e111751cf Merge commit 'v3.0' into android-3.0 2011-08-10 18:04:30 -07:00
Erik Gilling
e65d9fe556 usb: otg: add proxy_wait handler to otg_id
Some otg_id handlers can detect what's connected but can't detect a change.
This allows that handler to pass off the waiting for ID change to a proxy.

Change-Id: Ib38b750c3da4bffc35e37b620ecee37c5d64d31f
Signed-off-by: Erik Gilling <konkers@android.com>
2011-08-09 17:58:49 -07:00
Iliyan Malchev
3fe24366a4 ion: minor clean up
-- init rb nodes in ion_handle_create
-- in ion_handle_destroy, check that a node belongs to a tree before removing
   it (safety check, does not happen right now)
-- mark as static functions used only inside ion.c
-- update comments to ion_share() with a relevant blurb from the implementation
-- other minor updates/typo fixes to comments

Signed-off-by: Iliyan Malchev <malchev@google.com>
2011-08-09 14:49:21 -07:00
Trond Myklebust
c14acb19a4 NFS: Fix spurious readdir cookie loop messages
commit 0c0308066ca53fdf1423895f3a42838b67b3a5a8 upstream.

If the directory contents change, then we have to accept that the
file->f_pos value may shrink if we do a 'search-by-cookie'. In that
case, we should turn off the loop detection and let the NFS client
try to recover.

The patch also fixes a second loop detection bug by ensuring
that after turning on the ctx->duped flag, we read at least one new
cookie into ctx->dir_cookie before attempting to match with
ctx->dup_cookie.

Reported-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04 21:58:40 -07:00
Benjamin Herrenschmidt
b045b9a265 mm/futex: fix futex writes on archs with SW tracking of dirty & young
commit 2efaca927f5cd7ecd0f1554b8f9b6a9a2c329c03 upstream.

I haven't reproduced it myself but the fail scenario is that on such
machines (notably ARM and some embedded powerpc), if you manage to hit
that futex path on a writable page whose dirty bit has gone from the PTE,
you'll livelock inside the kernel from what I can tell.

It will go in a loop of trying the atomic access, failing, trying gup to
"fix it up", getting succcess from gup, go back to the atomic access,
failing again because dirty wasn't fixed etc...

So I think you essentially hang in the kernel.

The scenario is probably rare'ish because affected architecture are
embedded and tend to not swap much (if at all) so we probably rarely hit
the case where dirty is missing or young is missing, but I think Shan has
a piece of SW that can reliably reproduce it using a shared writable
mapping & fork or something like that.

On archs who use SW tracking of dirty & young, a page without dirty is
effectively mapped read-only and a page without young unaccessible in the
PTE.

Additionally, some architectures might lazily flush the TLB when relaxing
write protection (by doing only a local flush), and expect a fault to
invalidate the stale entry if it's still present on another processor.

The futex code assumes that if the "in_atomic()" access -EFAULT's, it can
"fix it up" by causing get_user_pages() which would then be equivalent to
taking the fault.

However that isn't the case.  get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state.  It will eventually
update those bits ...  in the struct page, but not in the PTE.

Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.

Basically, gup is the wrong interface for the job.  The patch provides a
more appropriate one which boils down to just calling handle_mm_fault()
since what we are trying to do is simulate a real page fault.

The futex code currently attempts to write to user memory within a
pagefault disabled section, and if that fails, tries to fix it up using
get_user_pages().

This doesn't work on archs where the dirty and young bits are maintained
by software, since they will gate access permission in the TLB, and will
not be updated by gup().

In addition, there's an expectation on some archs that a spurious write
fault triggers a local TLB flush, and that is missing from the picture as
well.

I decided that adding those "features" to gup() would be too much for this
already too complex function, and instead added a new simpler
fixup_user_fault() which is essentially a wrapper around handle_mm_fault()
which the futex code can call.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix some nits Darren saw, fiddle comment layout]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Shan Hai <haishan.bai@gmail.com>
Tested-by: Shan Hai <haishan.bai@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Darren Hart <darren.hart@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04 21:58:38 -07:00
Peng Tao
892cd4a38f pnfs: let layoutcommit handle a list of lseg
commit a9bae5666d0510ad69bdb437371c9a3e6b770705 upstream.

There can be multiple lseg per file, so layoutcommit should be
able to handle it.

[Needed in v3.0]
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04 21:58:37 -07:00
Stefan Richter
6f43778391 firewire: cdev: prevent race between first get_info ioctl and bus reset event queuing
commit 93b37905f70083d6143f5f4dba0a45cc64379a62 upstream.

Between open(2) of a /dev/fw* and the first FW_CDEV_IOC_GET_INFO
ioctl(2) on it, the kernel already queues FW_CDEV_EVENT_BUS_RESET events
to be read(2) by the client.  The get_info ioctl is practically always
issued right away after open, hence this condition only occurs if the
client opens during a bus reset, especially during a rapid series of bus
resets.

The problem with this condition is twofold:

  - These bus reset events carry the (as yet undocumented) @closure
    value of 0.  But it is not the kernel's place to choose closures;
    they are privat to the client.  E.g., this 0 value forced from the
    kernel makes it unsafe for clients to dereference it as a pointer to
    a closure object without NULL pointer check.

  - It is impossible for clients to determine the relative order of bus
    reset events from get_info ioctl(2) versus those from read(2),
    except in one way:  By comparison of closure values.  Again, such a
    procedure imposes complexity on clients and reduces freedom in use
    of the bus reset closure.

So, change the ABI to suppress queuing of bus reset events before the
first FW_CDEV_IOC_GET_INFO ioctl was issued by the client.

Note, this ABI change cannot be version-controlled.  The kernel cannot
distinguish old from new clients before the first FW_CDEV_IOC_GET_INFO
ioctl.

We will try to back-merge this change into currently maintained stable/
longterm series, and we only document the new behaviour.  The old
behavior is now considered a kernel bug, which it basically is.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
2011-08-04 21:58:34 -07:00
Herbert Xu
97edbc9012 gro: Only reset frag0 when skb can be pulled
commit 17dd759c67f21e34f2156abcf415e1f60605a188 upstream.

Currently skb_gro_header_slow unconditionally resets frag0 and
frag0_len.  However, when we can't pull on the skb this leaves
the GRO fields in an inconsistent state.

This patch fixes this by only resetting those fields after the
pskb_may_pull test.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-08-04 21:58:31 -07:00
JP Abgrall
0b893f0f37 netfilter: xt_qtaguid: add uid permission checks during ctrl/stats access
* uid handling
 - Limit UID impersonation to processes with a gid in AID_NET_BW_ACCT.
   This affects socket tagging, and data removal.
 - Limit stats lookup to own uid or the process gid is in AID_NET_BW_STATS.
   This affects stats lookup.

* allow pacifying the module
  Setting passive to Y/y will make the module return immediately on
  external stimulus.
  No more stats and silent success on ctrl writes.
  Mainly used when one suspects this module of misbehaving.

Change-Id: I83990862d52a9b0922aca103a0f61375cddeb7c4
Signed-off-by: JP Abgrall <jpa@google.com>
2011-07-21 18:04:46 -07:00
Mike Lockwood
adf0760ceb USB: gadget: f_accessory: Add ioctl to detect USB accessory attached at boot
Signed-off-by: Mike Lockwood <lockwood@android.com>
2011-07-21 15:26:03 -07:00
Linus Torvalds
cf6ace16a3 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  signal: align __lock_task_sighand() irq disabling and RCU
  softirq,rcu: Inform RCU of irq_exit() activity
  sched: Add irq_{enter,exit}() to scheduler_ipi()
  rcu: protect __rcu_read_unlock() against scheduler-using irq handlers
  rcu: Streamline code produced by __rcu_read_unlock()
  rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special
  rcu: decrease rcu_report_exp_rnp coupling with scheduler
2011-07-20 15:56:25 -07:00
Ingo Molnar
d1e9ae47a0 Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent 2011-07-20 20:59:26 +02:00
Peter Zijlstra
e3589f6c81 sched: Allow for overlapping sched_domain spans
Allow for sched_domain spans that overlap by giving such domains their
own sched_group list instead of sharing the sched_groups amongst
each-other.

This is needed for machines with more than 16 nodes, because
sched_domain_node_span() will generate a node mask from the
16 nearest nodes without regard if these masks have any overlap.

Currently sched_domains have a sched_group that maps to their child
sched_domain span, and since there is no overlap we share the
sched_group between the sched_domains of the various CPUs. If however
there is overlap, we would need to link the sched_group list in
different ways for each cpu, and hence sharing isn't possible.

In order to solve this, allocate private sched_groups for each CPU's
sched_domain but have the sched_groups share a sched_group_power
structure such that we can uniquely track the power.

Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-08bxqw9wis3qti9u5inifh3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-20 18:32:41 +02:00
Peter Zijlstra
9c3f75cbd1 sched: Break out cpu_power from the sched_group structure
In order to prepare for non-unique sched_groups per domain, we need to
carry the cpu_power elsewhere, so put a level of indirection in.

Reported-and-tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-qkho2byuhe4482fuknss40ad@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-20 18:32:40 +02:00
Paul E. McKenney
7765be2fec rcu: Fix RCU_BOOST race handling current->rcu_read_unlock_special
The RCU_BOOST commits for TREE_PREEMPT_RCU introduced an other-task
write to a new RCU_READ_UNLOCK_BOOSTED bit in the task_struct structure's
->rcu_read_unlock_special field, but, as noted by Steven Rostedt, without
correctly synchronizing all accesses to ->rcu_read_unlock_special.
This could result in bits in ->rcu_read_unlock_special being spuriously
set and cleared due to conflicting accesses, which in turn could result
in deadlocks between the rcu_node structure's ->lock and the scheduler's
rq and pi locks.  These deadlocks would result from RCU incorrectly
believing that the just-ended RCU read-side critical section had been
preempted and/or boosted.  If that RCU read-side critical section was
executed with either rq or pi locks held, RCU's ensuing (incorrect)
calls to the scheduler would cause the scheduler to attempt to once
again acquire the rq and pi locks, resulting in deadlock.  More complex
deadlock cycles are also possible, involving multiple rq and pi locks
as well as locks from multiple rcu_node structures.

This commit fixes synchronization by creating ->rcu_boosted field in
task_struct that is accessed and modified only when holding the ->lock
in the rcu_node structure on which the task is queued (on that rcu_node
structure's ->blkd_tasks list).  This results in tasks accessing only
their own current->rcu_read_unlock_special fields, making unsynchronized
access once again legal, and keeping the rcu_read_unlock() fastpath free
of atomic instructions and memory barriers.

The reason that the rcu_read_unlock() fastpath does not need to access
the new current->rcu_boosted field is that this new field cannot
be non-zero unless the RCU_READ_UNLOCK_BLOCKED bit is set in the
current->rcu_read_unlock_special field.  Therefore, rcu_read_unlock()
need only test current->rcu_read_unlock_special: if that is zero, then
current->rcu_boosted must also be zero.

This bug does not affect TINY_PREEMPT_RCU because this implementation
of RCU accesses current->rcu_read_unlock_special with irqs disabled,
thus preventing races on the !SMP systems that TINY_PREEMPT_RCU runs on.

Maybe-reported-by: Dave Jones <davej@redhat.com>
Maybe-reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-19 21:38:52 -07:00
WANG Cong
a07c7964a2 include/linux/sdla.h: remove the prototype of sdla()
`make headers_check` complains that

linux-2.6/usr/include/linux/sdla.h:116: userspace cannot reference
function or variable defined in the kernel

this is due to that there is no such a kernel function,

void sdla(void *cfg_info, char *dev, struct frad_conf *conf, int quiet);

I don't know why we have it in a kernel header, so remove it.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-18 11:06:03 -07:00
Todd Poynor
9a736a2169 Move x86_64 idle notifiers to generic
Move the x86_64 idle notifiers originally by Andi Kleen and Venkatesh
Pallipadi to generic.

Change-Id: Idf29cda15be151f494ff245933c12462643388d5
Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Todd Poynor <toddpoynor@google.com>
2011-07-15 15:26:45 -07:00
Mike Lockwood
ce4022b8d1 USB: gadget: f_mtp: Add support for sending MTP header during file transfer
MTP_SEND_FILE_WITH_HEADER ioctl allows sending a file with the 12 byte header
prepended at the beginning.
This is to allow MTP to use a single packet for the data phase instead of two.

Signed-off-by: Mike Lockwood <lockwood@android.com>
2011-07-15 17:18:14 -04:00
Michał Mirosław
62f2a3a48b net: remove NETIF_F_ALL_TX_OFFLOADS
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.

For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.

This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.

NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 15:18:49 -07:00
Linus Torvalds
51414d4108 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: core: Bus width testing needs to handle suspend/resume
2011-07-13 16:47:31 -07:00
Philip Rakity
f39b2dd9d0 mmc: core: Bus width testing needs to handle suspend/resume
On reading the ext_csd for the first time (in 1 bit mode), save the
ext_csd information needed for bus width compare.

On every pass we make re-reading the ext_csd, compare the data
against the saved ext_csd data.

This fixes a regression introduced in 3.0-rc1 by 08ee80cc39
("mmc: core: eMMC bus width may not work on all platforms"), which
incorrectly assumed we would be re-reading the ext_csd at resume-
time.

Signed-off-by: Philip Rakity <prakity@marvell.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-07-13 14:54:37 -04:00
Colin Cross
bb5b603036 cgroup: Add generic cgroup subsystem permission checks
Rather than using explicit euid == 0 checks when trying to move
tasks into a cgroup via CFS, move permission checks into each
specific cgroup subsystem. If a subsystem does not specify a
'allow_attach' handler, then we fall back to doing our checks
the old way.

Use the 'allow_attach' handler for the 'cpu' cgroup to allow
non-root processes to add arbitrary processes to a 'cpu' cgroup
if it has the CAP_SYS_NICE capability set.

This version of the patch adds a 'allow_attach' handler instead
of reusing the 'can_attach' handler.  If the 'can_attach' handler
is reused, a new cgroup that implements 'can_attach' but not
the permission checks could end up with no permission checks
at all.

Change-Id: Icfa950aa9321d1ceba362061d32dc7dfa2c64f0c
Original-Author: San Mehat <san@google.com>
Signed-off-by: Colin Cross <ccross@android.com>
2011-07-12 22:58:35 -07:00
Colin Cross
75c56a8111 Merge commit 'v3.0-rc7' into android-3.0 2011-07-12 20:10:37 -07:00
Linus Torvalds
8d86e5f914 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mm: Fix memory_block_size_bytes() for non-pseries
  mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
2011-07-12 14:21:19 -07:00
Linus Torvalds
d93a881dd7 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  pcmcia: pxa2xx/vpac270: free gpios on exist rather than requesting
  ARM: pxa/raumfeld: fix device name for codec ak4104
  ARM: pxa/raumfeld: display initialisation fixes
  ARM: pxa/raumfeld: adapt to upcoming hardware change
  ARM: pxa: fix gpio_to_chip() clash with gpiolib namespace
  genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)
  arm: mach-vt8500: add forgotten irq_data conversion
  ARM: pxa168: correct nand pmu setting
  ARM: pxa910: correct nand pmu setting
  ARM: pxa: fix PGSR register address calculation
2011-07-12 14:19:51 -07:00
Benjamin Herrenschmidt
a63fdc5156 mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2011-07-12 11:08:01 +10:00
JP Abgrall
511b7e3208 netfilter: adding the original quota2 from xtables-addons
The original xt_quota in the kernel is plain broken:
  - counts quota at a per CPU level
    (was written back when ubiquitous SMP was just a dream)
  - provides no way to count across IPV4/IPV6.

This patch is the original unaltered code from:
  http://sourceforge.net/projects/xtables-addons

  at commit e84391ce665cef046967f796dd91026851d6bbf3

Change-Id: I19d49858840effee9ecf6cff03c23b45a97efdeb
Signed-off-by: JP Abgrall <jpa@google.com>
2011-07-11 16:39:01 -07:00